Date of Deposit: December 15, 1999



PATENT APPLICATION 
Attorney Docket No. 2065 1-001 (DWW-1) 



Identification of Compounds for Modulating 
Dimeric Receptors 

Related Applications 

This application claims priority to Canadian Patent Application No. 2,273,576, 
filed May 27, 1999, which is incorporated herein in its entirety. 

Field of the Invention 

The invention relates to methods of using the three dimensional structure of an 
intrinsically covalent dimeric receptor, preferably the insulin receptor, to identify test 
compounds that will interact with the dimeric receptor and modulate its activity. The 
invention also includes compounds identified using the methods of the invention. 

Background of the Invention 

Covalent dimeric receptors are found on almost all cells in mammals. These
receptors include IR (insulin receptor), IGF-1R (insulin-like growth factor I receptor) and IRR
(the insulin receptor-related receptor). In the case of IR, insulin binding to IR is essential
for its manifold effects such as glucose homeostasis, increased protein synthesis, growth,
and development in mammals. IR belongs to the superfamily of transmembrane receptor
tyrosine kinases (TKs) that include the monomeric epidermal growth factor receptor (EGFR)
and platelet-derived growth factor receptor (PDGFR). In contrast, IR and its homologues IGF-1R and
IRR are sub-types of this family that are intrinsic disulfide-linked dimers of two
heterodimers of the form (αβ)2 (1,2). Monomeric receptor TKs are inactive, but are
activated by ligand-induced dimerization that results in autophosphorylation. Dimeric
IR-like TKs are also inactive, and are activated by ligand binding without further
dimerization. Insulin binding to the extracellular domain of IR results in
autophosphorylation of specific tyrosines in the cytoplasmic domain to initiate an
intracellular signal transduction cascade (3). However, the structural basis for the
mechanism of IR activation by extracellular insulin binding has not been elucidated
because the quaternary structure of IR was unknown. Only some of the smaller domains
have yielded high resolution structural information.

Diabetes may be caused by mutant IR (e.g., acanthosis nigricans or leprechaunism).
Insulin resistance leading to diabetes or similar symptoms may also occur. Diseases are
also caused by insufficient amounts of IR ligand. For example, in diabetes, the pancreas
produces insufficient amounts of insulin. Insulin activates IR and allows cells to absorb
and store glucose. In the absence of adequate insulin, glucose accumulates in excessive
amounts in the blood (hyperglycemia). The symptoms of diabetes may include poor
blood circulation, blindness and organ damage. These symptoms often lead to premature
death.

Diabetes is presently treated by insulin replacement therapy. This treatment has
been very successful, but it still has problems, such as difficulty maintaining glycemic
control. Poor glycemic control can cause retinopathy, poor blood circulation and the other
problems associated with diabetes. It is also difficult to formulate insulin for slow release.
Modified insulins have been created in an attempt to address problems with insulin therapy.
In some cases, "super-insulins" have been created to increase the activation of insulin
receptor by its ligand. In other cases, binding to insulin receptor is not substantially
increased, but the ligand has more favourable formulation properties. For example, in
Humalog™, a lysine and a proline in insulin are switched to provide more favourable
solubility characteristics.

These drug design strategies have been based on limited information, such as the 
chemical properties of the insulin molecule. In some cases, insulin has been randomly 
modified and then assayed to determine the effects on insulin activity. While there has
been success in producing insulin variants, both of these approaches are time consuming 
because variants are made without a clear understanding of the effect of the variation on 
binding to insulin receptor. There is a need to obtain additional information about the 
insulin receptor in order to provide a rational basis for drug design. 

For example, it would be helpful if the quaternary structure, including the ligand 
binding site, of IR was available and characterized to the detail of amino acids. However, 
it is very difficult to obtain information about the quaternary structure of dimeric 
receptors. For example, large transmembrane proteins such as cell surface hormone 
receptors have been difficult to crystallize as intact molecules for high-resolution 
structural study. They are also too large for NMR spectroscopy. The 480-kDa insulin 
receptor (IR) has thus not been crystallized as an intact molecule, and its quaternary 
structure remains unknown to date. 

Summary of the Invention 

We have obtained the quaternary structure of IR. We used low-dose low- 
temperature dark field scanning transmission electron microscopy (STEM). Using 
electron micrographs of the insulin-IR complex we have reconstructed the three- 
dimensional quaternary structure of the intact receptor complexed with gold-labeled 
insulin ligand. Although IR has been purified and studied for over 15 years, this is the 
first 3D reconstruction of its entire dimeric structure. Contiguous high densities within 
the 3D structure indicate a two-fold symmetry for this dimeric membrane receptor, as 
well as a logical sequence for its biochemical subdomains from the observed binding of a 
single insulin on the ectodomain to the juxtaposition of the pair of intrinsic tyrosine 
kinases (TKs) of the intracellular domain. 
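
The two-fold symmetry noted above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the symmetry axis runs parallel to the z axis and that coordinates are expressed in angstroms; the function names and the example points are hypothetical and are not taken from the reconstruction. Under those assumptions, the position of a feature in the second monomer follows from the first by a 180° rotation about the axis.

```python
def rotate_two_fold(point, axis_point=(0.0, 0.0)):
    """Rotate an (x, y, z) point by 180 degrees about an axis parallel to z.

    axis_point gives the (x, y) position where the assumed symmetry axis
    crosses the xy plane. A 180 degree rotation mirrors x and y through
    that position and leaves z unchanged.
    """
    x, y, z = point
    ax, ay = axis_point
    return (2 * ax - x, 2 * ay - y, z)


def symmetry_mate(coords, axis_point=(0.0, 0.0)):
    """Generate the symmetry-related copy of a list of points."""
    return [rotate_two_fold(p, axis_point) for p in coords]


# Hypothetical positions for two features of one monomer (not real data).
monomer_a = [(10.0, 5.0, 30.0), (12.5, -3.0, 18.0)]
monomer_b = symmetry_mate(monomer_a)
```

Applying the operation twice returns the original positions, which is the defining property of a two-fold axis.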



We determined structural relationships of the IR subdomains in the 3D
reconstruction of IR and a structural basis for IR activation by insulin. In the absence of
ATP, which is required to complete the activation of the IR tyrosine kinase, the structure of
this insulin-bound IR can be considered to be in a transitional state, with its kinase domains
intermediate between the inactive and activated structures observed by x-ray
crystallography (4).

The quaternary structure of IR, fitted with the atomic co-ordinates of highly
analogous domains of IR, has resulted in a detailed description of the insulin binding site on
the insulin receptor. Moreover, the combination of structural detail from 20 Å to atomic
resolution yielded a self-consistent model for the mechanism of the initial phase of insulin
action on binding to effect intracellular receptor tyrosine kinase activation.

The complete IR model provides a simple mechanical paradigm for the reversible
transmembrane signalling response. It explains the need for the complexity of structural
components to control both inhibition and accommodation of tyrosine kinase activation. It
gives ready structural explanations for many normal effects, for various mutations and for
mild chemical reduction of the insulin receptor. It thus provides a comprehensive structural
basis for the mechanics of transmembrane signal transduction for the intrinsically dimeric
insulin-like membrane receptors.

The details of the insulin binding site provide an explanation of binding of normal
human insulin as well as of the lesser or greater binding of insulin from other animals to the
human IR, and explain the binding of modified insulins such as "super-insulins",
Humalog™ and other insulin analogs.

One aspect of the invention includes a method of identifying a compound that 
modulates insulin receptor activity, including producing a compound that interacts with 
25 all or part of the fitted quaternary structure of insulin receptor or a fragment or derivative 
thereof and which thereby modulates insulin receptor activity. In one embodiment, the
method further includes synthesizing the compounds. 

Another aspect of the invention includes a method of identifying a compound that 
modulates insulin receptor activity, including comparing the structure of a compound for 
modulating insulin receptor activity to all or part of the fitted quaternary structure of 
insulin receptor or a fragment or derivative thereof to determine whether the compound is 
likely to modulate insulin receptor activity. 

The method may further include determining whether the compound modulates 
the activity of the insulin receptor or a fragment or a derivative thereof having IR activity 
in an in vivo or in vitro assay. The compound identified by the method is an IR agonist or 
an IR antagonist. In one variation, the fitted quaternary structure of IR comprises 
substantially the entire fitted quaternary structure of IR. 

The method may further include: 

a) introducing into a computer program information defining a ligand binding site 
conformation including at least one residue from monomer A in Table I and at 
least one residue from monomer B in Table I, the ligand binding site defined by 
the approximate amino acid distances listed in Table I, wherein the program 
displays the quaternary structure thereof; 

b) comparing the structural coordinates of the compound to the structural coordinates 
of the ligand binding site and determining whether the compound fits spatially 
into the ligand binding site and is capable of changing IR from an inactive 
conformation to an active conformation or biasing IR toward an active 
conformation; 



wherein the ability to change IR from an inactive conformation to an active 
conformation or bias IR toward an active conformation is predictive of the ability 
of the compound to agonize IR activity. 
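
The coordinate comparison in step b) can be sketched as follows. This is a minimal illustration under stated assumptions, not the program contemplated by the application: the cutoff distances, the function names, the residue labels and the example coordinates are all hypothetical, and the sketch tests only spatial fit (no steric clash, and contact with at least one residue from each of monomer A and monomer B). It does not predict whether a compound changes the conformation of IR.

```python
import math

# Assumed cutoffs in angstroms; the application does not specify values.
CONTACT_CUTOFF = 4.0  # at or below this distance counts as a contact
CLASH_CUTOFF = 2.0    # below this distance counts as a steric clash


def min_distance(atoms_a, atoms_b):
    """Smallest pairwise distance between two lists of (x, y, z) points."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def fits_binding_site(compound_atoms, site):
    """Return (fits, contacts) for a compound placed in a binding site.

    site maps a monomer label ("A" or "B") to a dict of
    residue label -> list of atom coordinates.
    """
    contacts = {"A": [], "B": []}
    for monomer, residues in site.items():
        for label, atoms in residues.items():
            d = min_distance(compound_atoms, atoms)
            if d < CLASH_CUTOFF:
                # The compound overlaps the receptor: reject the placement.
                return False, contacts
            if d <= CONTACT_CUTOFF:
                contacts.setdefault(monomer, []).append(label)
    # A fit requires at least one contact on each monomer.
    return bool(contacts["A"]) and bool(contacts["B"]), contacts


# Hypothetical site with one placeholder residue per monomer (not Table 1 data).
site = {
    "A": {"resA1": [(0.0, 0.0, 0.0)]},
    "B": {"resB1": [(6.0, 0.0, 0.0)]},
}
candidate = [(3.0, 0.0, 0.0)]
fits, contacts = fits_binding_site(candidate, site)
```

A compound that passes such a geometric filter would still need to be evaluated for its effect on receptor conformation and tested in an IR activity assay.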

The method may further include preparing the compound that fits spatially into the ligand 
binding site and determining whether the compound agonizes IR activity in an IR activity 
assay. The invention also includes a method of identifying a compound which agonizes 
IR or a fragment or derivative thereof having IR activity, the IR, fragment or derivative 
including a ligand binding site with at least one of the residues and approximate structural 
coordinates of each of monomer A and monomer B listed in Table 1, the method 
including the steps of: 

a) providing the coordinates of the ligand binding site of the IR to a computerized 
modeling system; 

b) identifying compounds which interact with the ligand binding site and change IR 
from an inactive conformation to an active conformation or bias IR toward an 
active conformation. 

The invention also includes a method of drug design including using at least one 
of the amino acids of each of monomer A and monomer B of IR in Table 1 to determine 
whether a compound interacts with the ligand binding site of IR or a fragment or 
derivative thereof having IR activity and is capable of changing IR from an inactive 
conformation to an active conformation or biasing IR toward an active conformation. 

Another aspect of the invention includes a method of agonizing IR including 
administering to a mammal a compound that fits spatially into the ligand binding site of 
IR, the compound interacting with at least 

a) one IR amino acid in monomer A listed in Table 1; and

b) one IR amino acid in monomer B listed in Table 1;



wherein the compound is capable of changing IR from an inactive conformation 
to an active conformation or biasing IR toward an active conformation. 



The method may further include: 

a) introducing into a computer program information defining a ligand binding site 
conformation including at least one residue from monomer A in Table I and at 
least one residue from monomer B in Table I, the ligand binding site defined by 
the approximate amino acid coordinates listed in Table I, wherein the program 
displays the quaternary structure thereof; 

b) comparing the structural coordinates of the compound to the structural coordinates 
of the ligand binding site and determining whether the compound fits spatially 
into the ligand binding site and is capable of changing IR from an active 
conformation to an inactive conformation or biasing IR toward an inactive 
conformation; 

wherein the ability to change IR from an active conformation to an inactive
conformation or bias IR toward an inactive conformation is predictive of the ability
of the compound to antagonize IR activity. 

The method may include preparing the compound that fits spatially into the ligand 
binding site and determining whether the test compound antagonizes IR activity in an IR 
activity assay. 

Another aspect of the invention includes a method of identifying a compound 
which antagonizes IR or a fragment or derivative thereof having IR activity, the IR, 
fragment or derivative including a ligand binding site with at least one of the residues and 
approximate distances of each of monomer A and monomer B listed in Table I, the 
method including the steps of: 



a) providing the coordinates of the ligand binding site of the IR to a computerized 
modeling system; 

b) identifying compounds which interact with the ligand binding site and change IR 
from an active conformation to an inactive conformation or bias IR toward an 
inactive conformation. 

A variation of the invention includes a method of drug design including using at 
least one of the structural coordinates from each of monomer A and monomer B of IR in 
Table 1 to determine whether a compound interacts with the ligand binding site of IR or a 
fragment or derivative thereof having IR activity and is capable of changing IR from an 
active conformation to an inactive conformation or biasing IR toward an inactive
conformation. 

The invention also includes a method of antagonizing IR by administering to a mammal a 
compound that fits spatially into the ligand binding site of IR, the compound interacting 
with at least: 

a) one IR amino acid in monomer A listed in Table 1; and

b) one IR amino acid in monomer B listed in Table 1;

wherein the compound is capable of changing IR from an active conformation to an
inactive conformation or biasing IR toward an inactive conformation. In a variation of the
method, the ability of the compound to fit spatially into the ligand binding site is
determined by comparing the structural coordinates of the compound with the structural
coordinates of IR. The ability of the compound to change the conformation of IR can be
determined by comparing the structural coordinates of the compound with the structural
coordinates of IR.

Another variation of the invention includes: 

a) introducing into a computer program information defining a cam including at least
one residue from the Cam-loop segment in Table 2 and at least one residue from
the L1 surface in Table 2, wherein the program displays the structure thereof and
its relation to other IR domains; 

b) comparing the structural coordinates of the compound to the structural coordinates 
of the cam and determining whether the compound interacts with the cam and is 
capable of changing IR from an inactive conformation to an active conformation 
or biasing IR toward an active conformation; 

wherein the ability to change IR from an inactive conformation to an active conformation 
is predictive of the ability of the compound to agonize IR activity. The method can 
further include preparing the compound that interacts with the cam and determining 
whether the test compound agonizes IR activity in an IR activity assay. The invention 
includes a method of identifying a compound which agonizes IR or a fragment or 
derivative thereof having IR activity, the IR, fragment or derivative including a cam with 
at least one of the residues and approximate structural coordinates of the cam-loop 
segment and the L1 surface listed in Table 2, the method including the steps of:

a) providing the coordinates of the cam to a computerized modeling system; 

b) determining compounds which interact with the cam and change IR from an 
inactive conformation to an active conformation or bias IR toward an active 
conformation. 

The invention includes a method of drug design including using at least one of the 
structural coordinates from each of cam-loop segment and the L1 surface listed in Table 2
to determine whether a compound interacts with the cam of IR or a fragment or derivative 
thereof having IR activity and is capable of changing IR from an inactive conformation to 
an active conformation or biasing IR toward an active conformation. A variation of the 
method of agonizing IR includes administering to a mammal a compound that fits 
spatially into the cam of IR, the compound interacting with at least one of the residues
and approximate structural coordinates of the cam-loop segment and the L1 surface listed
in Table 2; wherein the compound is capable of changing IR from an inactive 
conformation to an active conformation or biasing IR toward an active conformation. 

The method can further include: 

a) introducing into a computer program information defining a cam conformation 
including at least one residue from the Cam-loop segment in Table 2 and at least 
one residue from the L1 surface in Table 2, wherein the program displays the
structure thereof and its relation to other IR domains; 

b) comparing the structural coordinates of the compound to the structural coordinates 
of the cam and determining whether the compound interacts with the cam and is 
capable of changing IR from an active conformation to an inactive conformation; 

wherein the ability to change IR from an active conformation to an inactive conformation 
is predictive of the ability of the compound to antagonize IR activity. The method can 
additionally include preparing the compound that interacts with the cam and determining 
whether the test compound antagonizes IR activity in an IR activity assay. 

The invention also includes a method of identifying a compound which 
antagonizes IR or a fragment or derivative thereof having IR activity, the IR, fragment or 
derivative including a cam with at least one of the residues and approximate structural 
coordinates of the cam-loop segment and the L1 surface listed in Table 2, the method
including the steps of: 

a) providing the coordinates of the cam to a computerized modeling system; 

b) identifying compounds which interact with the cam and change IR from an active
conformation to an inactive conformation or bias IR toward an inactive
conformation.

Another variation of the invention includes a method of producing an IR modulator
including using at least one of the structural coordinates from each of cam-loop segment
and the L1 surface listed in Table 2 to determine whether a compound interacts with the
cam of IR or a fragment of IR or derivative thereof having IR activity and is capable of
changing IR from an active conformation to an inactive conformation or biasing IR
toward an inactive conformation.

The method of antagonizing IR can include administering to a mammal a 
compound that interacts with the cam of IR, the compound interacting with at least one of 
the residues and approximate structural coordinates of the cam-loop segment and the L1
surface listed in Table 2; wherein the compound is capable of changing IR from an active
conformation to an inactive conformation or biasing IR toward an inactive conformation.
The ability of the compound to interact with the cam can be determined by comparing the
structural coordinates of the compound with the structural coordinates of IR. In the
method of the invention, the ability of the compound to change the conformation
of IR can be determined by comparing the structural coordinates of the compound with
the structural coordinates of IR.

The methods of the invention may use free IR or IR bound to insulin in an 
IR:insulin complex. 

Another aspect of the invention includes a computer medium having recorded 
thereon data of an IR receptor, said data sufficient to model all or part of the quaternary 
structure of the receptor. The data can comprise structural coordinates of an IR receptor, 
the coordinates sufficient to model all or part of the quaternary structure of the receptor.
The quaternary structure of the receptor can include substantially all of the quaternary 
structure of the receptor. 

The invention also includes an insulin analog or other analog or mimetic 
identified by the methods of the invention. 

The invention also includes a method of identifying agonists of IR by rational
drug design including: producing an agonist for IR that will interact with amino acids in
the IR ligand binding site or IR cam based upon the structure coordinates of the
IR:insulin complex. The method may further include synthesizing the agonist and
determining whether the agonist agonizes the activity of IR in an in vivo or an in vitro
assay. In a method of the invention, the co-ordinates of the IR:insulin complex can be
obtained from an IR:insulin complex prepared for EM. The co-ordinates of the
IR:insulin complex may be obtained by means of computational analysis.

The agonist can be designed to interact with at least one amino acid in monomer 
A in Table 1 and at least one amino acid in monomer B in Table 1 and cause IR to change 
from an inactive conformation to an active conformation or bias IR toward an active 
conformation. 

The method of identifying a compound that modulates insulin receptor and insulin 
interactions or activity, can include: 

a) designing a compound for modulating insulin receptor activity based upon the 
fitted quaternary structure of insulin receptor bound to insulin. 

The method can further include synthesizing the compound and determining whether the
compound modulates the interactions or activity of the insulin receptor and insulin. 

Another aspect of the invention includes a method of identifying a compound that 
modulates insulin receptor and insulin interactions or activity, including: 

a) comparing a compound for modulating insulin receptor activity to the quaternary 
structure of insulin receptor bound to insulin to determine whether the compound 
is likely to modulate insulin receptor and insulin interactions or activity; 

b) determining whether the potential compound modulates the interactions or 
activity of the insulin receptor and insulin. 

The compound may agonize or antagonize insulin receptor and insulin interactions or
activity. The method of identifying how a compound affects IR activity may include
comparing the compound to all or part of the fitted quaternary structures of IR.

Another aspect of the invention includes a computer readable medium including all 
or part of the fitted quaternary structure of IR as shown in a figure or described in the 
application. 

Another aspect relates to an insulin analog identified by a method of the invention. 
The invention includes a method of agonizing insulin receptor including administering
an effective amount of the analog. The invention also includes a method of medical
treatment of diabetes or hyperglycemia including administering to a mammal having 
diabetes or hyperglycemia a pharmaceutical composition including an effective amount 
of the analog. Mimetics or other insulin variants may also be used. 

Brief Description of the Drawings

FIG. 1 describes the receptor-binding assay of Nanogold-insulin. Receptor-binding
activity of purified Nanogold-insulin was compared to that of bovine insulin in a receptor-
binding assay using human insulin receptor as described (9). Inset shows the mass
spectrum obtained from the MALDI-TOF analysis of purified Nanogold-insulin (7).

FIG. 2 A and B show STEM dark field images of the human insulin
receptor/Nanogold-insulin (HIR/NG-BI) complex. Specifically, FIG. 2A includes raw images
showing several complexes. Arrowheads point to intense signals from the Nanogold marker.
Scale bar = 20 nm. FIG. 2B depicts HIR/NG-BI images extracted from image fields, after
low pass filtering to 1.0 nm and boundary determination (left column). High density
threshold representation of extracted images showing one (top five images) or two (bottom
two images) sites of Nanogold location (right column).

FIG. 3 A and B depict the three-dimensional reconstruction of the HIR/NG-BI
complex from 704 STEM dark field images. FIG. 3A shows the density threshold
representing the total expected volume for the complex [1]; intermediate density threshold,
unsymmetrized, showing higher contiguous densities [2]; high density threshold of [2]
showing only the Nanogold label [3]. Circles in the panels indicate location of the gold
marker within the reconstructions. The resolution was 20 Å as measured by Fourier phase
residual analysis of two reconstructions with 352 images each (13). FIG. 3B depicts the
reconstruction with two-fold symmetry at intermediate density thresholds in different
orientations, indicating the relationship and connectivity of the structural domains. Labels,
for only one αβ monomer of the dimeric HIR, refer to biochemical domains. Arrowhead
indicates the proposed plane of the cell membrane lipid bilayer. L1, C-R, L2 =
L1-Cysteine-rich-L2 domains; CD = connecting domain; Fn1, Fn2 = fibronectin III repeats 1
and 2; TK = tyrosine kinase; TM = transmembrane domain.

FIG. 4 A and B include the fitting of biochemical domains and their known x-ray
structures to the 3D reconstruction. FIG. 4A is a schematic domain structure for one αβ
monomer, derived from i) connectivity of the 3D reconstruction at intermediate density
threshold (FIG. 3), ii) from the primary domain sequence, iii) from the requirement for two
disulfides on the two-fold symmetry axis between the two α subunits (4), iv) the fit of the
known domain structures, and v) the principle of keeping domains of unknown structure as
compact as possible. Distances measured in the 3D reconstruction between locations of
subdomains CD, Fn1 and the symmetrical disulfides were commensurate with numbers of
intervening amino acid residues (structures not shown to scale; α subunit shown in red, β
subunit in blue and green, unknown structures are spheres or lines): A = TK activation loop;
1 = Cys524; 2 = Cys682, 683, 685; 3 = alpha-beta disulfide between Cys647 and Cys872;
arrowhead = proreceptor cleavage site; other labels as described in FIG. 3B. FIG. 4B is a
representative fitting of L1-Cys-rich-L2 domains as approximate cylinders to ectodomain
structure of 3D reconstruction (cf. FIG. 3B, side view, 0°; for ribbon structure see FIG. 7A).
One insulin molecule (purple ribbon, PDB: 1BEN) inserted with its receptor-binding
domain contacting the L1-Cys-rich domains of one subunit (fuchsia) and the L2 domain of
the other (red). The Nanogold marker (yellow) on Phe1 of insulin B chain positioned to
coincide with the high-density site of reconstruction. C) Right angle side view of (B) (cf.
FIG. 3B, side view 90°) with L1-Cys-rich-L2 domains (insulin partly hidden), fitted TK
structure (green ribbon, PDB: 1IRK) and two dimeric FnIII structures (blue and red ribbons,
PDB: 1MFN). Red ribbons represent the portion of Fn1 derived from the α subunit. Activation
loop (black ribbon) of left TK domain is shown in its crystallographic position. A-loop
(dark blue) of symmetry-related right TK domain extended to overlap peptide substrate
position of opposite TK in peptide-bound state (4). See also (D). D) Right angle top view
of (B) (cf. FIG. 3B, top view) showing the positions of the FnIII domains (top and bottom,
blue/red) and the TK domains (green). Crystallographic position of activation loop (black)
is uppermost within one TK domain, while extended activation loop (dark blue) of the other
TK domain is below. One square in the wire mesh is 6.5 Å.

FIG. 5 A, B, and C are the three-dimensional structure of the human insulin receptor
reconstructed from images of the purified dimeric insulin receptor complexed with insulin
obtained via low dose scanning transmission [illegible]
of total volume to show contiguity of structure. Maximum diameter is 150 Å. Various
regions of one αβ monomer of the dimeric structure labelled as determined from insulin
location, connectivity, mass distribution and fitting of known subdomain structures.
(i), View as seen from the exterior of the cell, down the two-fold symmetry axis of the
(αβ)2 heterodimer. Partially transparent gray disc represents cell membrane with fainter
regions of structure on distal side of membrane. (ii), View at right angles to A with
extracellular components above gray translucent symbolic cell membrane. (iii), View from
interior of cell with fainter [illegible].
Arrow head points to cam-like feature (see text). For domain abbreviations see FIG. 6.
FIG. 5B is the simplified, stylized model of insulin-IR in the same orientations as FIG. 5A.
(i), View from exterior of cell. (ii), Side view (cell membrane edge-on). (iii), View from
interior of cell. Corresponding subdomains for one αβ monomer (fuchsia) are indicated.
The other αβ monomer (blue) is symmetrically related. Stylized catalytic regions and
activation loops (gold and black spheres and hairpins) are indicated on TK domains. The
two α-α disulphide bonds (1, 2, yellow) modelled on two-fold axis in strained
configuration. Cams (arrow head, black and gold discs) in position permissive for
transactivation. Insulin ligand represented as red disc. For domain abbreviations see FIG. 6.
FIG. 5C is the stylized model of IR in the absence of insulin. Same orientations as FIG. 5B.
(i), View from exterior of cell, with separated L1-Cys-rich domains. (ii), Side view (cell
membrane edge-on). (iii), View from interior of cell, with separated TK domains.
Activation loops (arrow) do not reach catalytic loops (spheres on TKs). Cams (arrow head,
black and gold discs) in position to block mutual approach of Fn2/TM/TK assemblies. Pair
of Cys-Cys bonds (1, 2, yellow) in relaxed equilibrium positions. Insulin (red disc) in
position to bind to one αβ monomer. For domain abbreviations see FIG. 6.

FIG. 6 depicts the sequential spatial arrangement of the subdomains of one αβ
monomer of the insulin receptor deduced from the 3D structure [1]. The N-terminal of the
α subunit (red) is at the top, the C-terminal of the β subunit (blue) near the bottom. The
domains and their delimiting amino acid sequences [5] are: αN-terminal - 1 - L1 -
158/159 - cysteine-rich (CR) - 310/311 - L2 - 470/471 -
connecting-domain/αFibronectin0 (CD/Fn0) - 572/573 - αFibronectin1 (αFn1) - 661/662 -
α-insert-domain (ID) - 719 - αC-terminal; βN-terminal - 724 - β-ID - 779/780 - βFn1 - 816/817 -
βFn2 - 913/914 - juxtamembrane - 929/930 - transmembrane (TM) - 952/953 -
juxtamembrane - 977/978 - tyrosine-kinase (TK) - 1283/1284 - C-terminal region - 1388 -
βC-terminal. Other important residues are Cys524 (denoted by "1"), which forms an α-α
bond on the two-fold symmetry axis, as does one of Cys682, Cys683 or Cys685 (shown
as "2"). An α-β bond is formed by Cys647 in Fn1 of the α subunit and Cys872 in Fn2 of
the β subunit (shown as "3"). "x" marks the cleavage site between the α and β subunits in
the pro-receptor. The catalytic loop and the activation loop (shown as "A-C"; residues
1130-37 and 1149-70, respectively) are approximately in the central region of the
tyrosine kinase structure [10,11].

FIG. 7 A, B, and C include the side view of IR dimer structure at volume
corresponding to total receptor mass, in wire mesh representation rotated 90° with respect
to 5a(ii), fitted centrally with two L1-CR-L2 regions of IR as adapted from the
co-ordinates of the corresponding IGF-1R structure. Amino acid backbone representation.
One L1-CR-L2 monomer region is shown in red, the other in fuchsia. The diamond-shaped
opening is the modelled insulin binding site with one Nanogold-insulin fitted into
the site (see FIG. 8). FIG. 7B depicts the end view of full-mass representation of IR
dimer. Left half: surface rendering; right half: wire mesh representation. Fitted structure of
two IR-adapted L1-CR-L2 regions (red and fuchsia). Arrow: cam-like region on CR
domain. FIG. 7C includes a higher density solid surface representation, slightly rotated
from the view in FIG. 7B, showing the location of CR cam regions of atomic structure
against Fn2 domains of 3D reconstruction.
FIG. 8 A, B, and C, and specifically, FIG. 8A are views in parallel stereo
representation of IR insulin-binding region of docked L1-CR-L2 regions (red and fuchsia;
cf. FIG. 7A) fitted with insulin (brown). FIG. 8B is the backbone representation except
for amino acid sidechains tabulated in Table 1. See text for details. Insulin contacts with
one L1-CR-L2 monomer (fuchsia). Slight rotation from FIG. 8A. The gold sphere
represents the Nanogold label on insulin used in the 3D reconstruction. FIG. 8C depicts
insulin contacts with second L1-CR-L2 monomer (red).

FIG. 9 A and B depict simplified schematics of structural changes during
activation of insulin receptor. FIG. 9A represents the inhibitory state. Ectodomain of
dimeric α subunits each with two differing insulin binding sites and blocking cam.
Unbound bivalent insulin. β subunits resting against cams, crossing membrane, with
tyrosine kinase (TK) domains separated. Arrows indicate thermally induced motion. FIG.
9B represents the insulin bound state. Blocking cams rotated, β subunits resting against
centre of ectodomain. TK domains juxtaposed for transphosphorylation.

FIG. 10 A and B. FIG. 10A includes views (parallel stereo) of fibronectin
domains docked into ectodomain quaternary structure of IR. Fn0 and αID regions are
modelled as extending around L2 to the central 2-fold symmetry axis to form α-α
disulphide bonds. The α-β disulphide is shown between αFn1 and Fn2. The domains of
one αβ monomer only are coloured and labelled for identification. For clarity, LCL is
shown only with part of the CR domain and all of the L2 domain (amino acids 250 to
470). FIG. 10B is the complete fit of known IR and IR-like domains as docked into 3D
EM reconstruction of quaternary structure of IR dimer. The TM and juxtamembrane
domains, of unknown structure, have been modelled as helix and loop structures and
arbitrarily placed to connect the Fn and TK domains. The unknown structures of the βID
region at the N-terminal of the βFn1 domain and the C-terminal β-domain joined to the TK
domains have not been modelled.
FIG. 11 is the sequence of human insulin.
FIG. 12 is the sequence of human insulin receptor. 
FIG. 13 is an apparatus for molecular modeling. 

Detailed Description of the Invention 

The invention includes new 3D structures for dimeric two-state receptors that are
activated or inhibited by ligand binding. It also includes aspects such as the ligand binding 
site, binding domains, other functional or structural domains and the mechanism of action 
of the receptors. The invention also includes methods of using these aspects to identify 
compounds capable of modulating (agonizing or antagonizing) the receptors. 

In one embodiment, the receptor is the insulin receptor (amino acid sequence is 
shown in FIG. 12). In a preferred embodiment the structure is the fitted quaternary 
structure of IR. The "fitted quaternary structure" of IR includes the structure of the IR 
domains fitted together to arrive at a three-dimensional arrangement that fits into the 
corresponding portion of the quaternary structure of IR. Parts of the fitted quaternary 
structure are also useful in the methods of the invention. Prior to this invention, the 3D 
structure of the receptor and its mechanism of activity were unknown. The relative 
positions of amino acids which bound insulin and provided receptor activity were also 
poorly understood. The invention details the atomic interactions of insulin with the dimeric 
insulin receptor (IR) in the extracellular insulin binding site of the receptor. Furthermore, a 
mechanism is detailed which shows how this binding of insulin results in transmembrane 
signalling to activate the intracellular intrinsic tyrosine kinase of the insulin receptor dimer. 
The structure and mechanism explain the normal function of the insulin receptor as well as 
the effect of mutations and of altered physiological conditions. The invention provides the 
first comprehensive description of insulin binding to insulin receptor and the mechanical
mechanism of insulin receptor activity. The structure of IR has been determined while
complexed to ligands (including insulin) and has been modeled in the insulin-free state.

The invention includes the structure of insulin receptor fitted with the atomic 
coordinates of the amino acids comprising the receptor, the use of that structure to solve 
the structure of insulin receptor isoforms, homologues and other forms of insulin 
receptor, mutants and co-complexes of insulin receptor, and the use of the insulin 
receptor structure and that of its isoforms, homologues, mutants, and co-complexes to 
design modulators. The structure is particularly useful for development of ingestible 
(preferably oral) insulin mimicking agents (analogs, mimetics) that can be used in place 
of insulin (which has to be administered by injection) to treat insulin-dependent diabetes. 

In one aspect the present invention is directed to the three-dimensional structure 
of an isolated and purified IR polypeptide and its structure coordinates. Another aspect 
of the invention is to use the structure coordinates of the insulin receptor to reveal the 
atomic details of the ligand binding site and one or more of the accessory binding sites of 
insulin receptor such as a cam. The entire receptor may be used or particular regions of 
interest may be used. Structural and conformational changes induced in the receptor may 
also be studied. Another aspect of the invention is to use the structure coordinates of an 
insulin receptor to solve the structure of a different insulin receptor or a mutant, 
homologue or co-complex of insulin receptor. A further aspect of the invention is to 
provide insulin receptor mutants characterized by one or more different properties 
compared to wild-type insulin receptor. Another aspect of this invention is to use the 
structure coordinates and atomic details of insulin receptors or mutants or homologues or 
co-complexes thereof to design, evaluate (preferably computationally), synthesize and use 
modulators of insulin receptor that prevent or treat the undesirable pathologies of 
inadequately or improperly functioning insulin receptor. 

The IR structure of the present invention includes the three dimensional structure 
of the receptor including the fitted quaternary structure. The IR structure includes the
ligand binding site that includes the amino acid residues listed in Table 1 and the
structures including the amino acid residues in Table 2.

This invention also provides the first rational drug design strategy for modulating
IR activity. It includes methods for identifying compounds that can interact with insulin
receptor. The method for identifying insulin mimetics and insulin antagonists preferably
includes fitting the crystal structures, NMR structures and other structures of insulin receptor
domains into the quaternary structure of the complete insulin-bound dimeric insulin
receptor determined from electron microscopic image reconstruction. These interactions
can be easily identified by comparing the structural, chemical and spatial characteristics
of a test compound to the three dimensional structure of the insulin receptor. Since the
amino acids that are responsible for receptor activity and binding were identified by this
invention, drug design may be done on a rational basis. Structures such as a cam or a
ligand binding site may be studied together or separately. Fragments of a cam or a ligand
binding site may also be studied (e.g. at least one or at least 2 of the amino acids in Table
1 or 2, optionally also including one or more proximate amino acids).

The structure serves as a detailed basis for the design and testing of insulin analogs,
mimetics and insulin antagonists, in vitro
and in vivo, providing a method for identifying modulators (antagonists and agonists)
having specific contacts with the insulin receptor or an isoform, homologue, mutant or co- 
complex. The effect of a modification to insulin may be readily viewed on a computer, 
without the need to synthesize the compound and assay it in vitro. As well, non-protein 
organic molecules may also be compared to the insulin receptor on a computer. One can 
readily determine if the molecules have suitable structural and chemical characteristics to 
interact with, and activate or inhibit, receptor activity. The invention includes the IR 
modulators discovered using all or part of an IR structure of the invention (preferably the 
fitted quaternary structure) and the methods of the invention.

Drug design

The determination of the quaternary structure of IR, and in particular its fitted
quaternary structure, provides a basis for the design of new and specific compounds for
the diagnosis and/or treatment of IR-related pathologies ("pathology" includes a disease,
a disorder and/or an abnormal physical state preferably characterized by either (i)
inadequate or excessive insulin in a mammal (preferably a human) or (ii) inadequate or
excessive IR activity. IR-related pathologies include those involving IR as in FIG. 12 or
IR variants described in this application). This structure is useful in the design of
modulators (agonists or antagonists), which may be used as therapeutic or prophylactic
compounds for treating pathologies in which upregulation or downregulation of receptor
activity is beneficial. It will be apparent that methods using IR described below may be
readily adapted for use with a fragment of IR or an IR variant.



The characterization of the novel IR ligand binding site and cams permits the
design of potent, highly selective IR modulators. Several approaches can be taken for the
use of the IR structure in the rational design of ligands of IR. A computer-assisted,
manual examination of a ligand binding site or cam structure may be done.



This invention includes the methods for identifying modulators of IR that act on
the IR quaternary structure (preferably the fitted quaternary structure), ligand binding site
and/or cam, as well as the modulators themselves. The agonist modulators upregulate IR
activity by biasing IR towards its active, closed conformation. The antagonist modulators
downregulate IR activity by biasing IR towards its inactive, open conformation. Such
modulators may bind to all or a portion of the ligand binding site of IR. They may also
modulate IR activity by interacting with other portions of IR, such as the cam structures.
They may be competitive or non-competitive modulators. Once identified and screened
for biological activity, these modulators may be used therapeutically or prophylactically
to affect IR activity.

The invention also includes methods of agonizing or antagonizing insulin receptor
by administering compounds with structural and chemical properties that allow the 
compounds to interact with insulin receptor residues in order to modulate receptor 
activity. 

Interaction of modulators of IR ligand binding site 

A test compound that is a modulator interacts with at least one insulin receptor 
residue listed in Table 1 on monomer A and at least one residue in Table 1 on monomer 
B in order to activate or inhibit insulin receptor. "Interact" refers to binding to the 
receptor which is capable of modulating its activity. Receptor fragments may be used in 
the methods of the invention to predict how the full receptor will react to a modulator. 
Since the IR is a 2-fold symmetric dimer structure, either one of the IR monomers can 
represent monomer A, the other representing monomer B. A modulator that is an agonist 
is capable of changing the IR from an inactive conformation to an active conformation. 
A modulator that is an antagonist is capable of changing the IR from an active 
conformation to an inactive conformation (or may keep or maintain IR in its inactive 
conformation). A modulator may bias the receptor towards a particular conformation 
instead of (or in addition to) changing the conformation. 

The compound may also interact with at least: two, three, or four or five of the
residues on each of monomer A or monomer B that are listed in Table 1. The test
compound may interact with at least about: five, six, seven or eight, nine, ten, eleven or
twelve amino acid residues on monomer B. The intersidechain distances between the
modulator and the IR are preferably about those distances (or at least one of the distances)
listed in Table 1. The distances may be varied by plus or minus about: 0.1 Å, 0.2 Å,
0.25 Å, 0.3 Å, 0.4 Å, 0.5 Å, 0.6 Å, 0.7 Å, 0.75 Å, 0.8 Å, 0.9 Å, 1 Å or >1 Å, >1.5 Å or 2 Å as
long as the test compound is still able to interact with IR and modulate its activity. It is
apparent that the test compound must be able to make appropriate interactions with the IR
ligand binding site if it is to activate the IR.
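
The contact criterion described above (at least one matched residue on each monomer, with each distance allowed to vary by a stated tolerance) can be sketched in code. The following is a minimal illustration only and is not the method of the invention. The function names, the data layout and the default tolerance of 0.5 Å are assumptions made for the example, and the residues placed in the reference set are chosen for illustration; only the 2.5 Å minimum approach distance is taken from the Table 1 footnote.

```python
# Illustrative reference distances in angstroms, keyed by (monomer, receptor residue).
# 2.5 is the minimum approach distance modelled in Table 1; the particular
# residues and their pairing with distances are for illustration only.
REFERENCE = {
    ("A", "Arg86"): 2.5,
    ("A", "Asp12"): 2.6,
    ("B", "Phe89"): 2.5,
    ("B", "His247"): 2.5,
}


def matched_contacts(modelled, tolerance=0.5):
    """Return the reference contacts reproduced by a test compound.

    modelled maps (monomer, residue) to a modelled distance in angstroms.
    A contact counts when it lies within +/- tolerance of the reference value.
    """
    return {
        key
        for key, dist in modelled.items()
        if key in REFERENCE and abs(dist - REFERENCE[key]) <= tolerance
    }


def spans_both_monomers(modelled, tolerance=0.5):
    """True when at least one contact is matched on monomer A and one on monomer B."""
    monomers = {monomer for monomer, _ in matched_contacts(modelled, tolerance)}
    return {"A", "B"} <= monomers


# Hypothetical test compound with one contact on each monomer.
candidate = {("A", "Arg86"): 2.7, ("B", "Phe89"): 2.9}
print(spans_both_monomers(candidate))
```

Tightening or loosening the `tolerance` argument corresponds to choosing one of the plus-or-minus ranges listed above.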
Table 1 - Modeled Approaches between Insulin Side Chains and Insulin Receptor Side
Chains

Insulin Residue   Insulin Receptor      Intersidechain   Interaction
                  Residue (Region)      Distance (Å)

Monomer A
GluA4‡            Arg86 (L1)            2.5†             electrostatic
ThrA8                                   2.6              polar
GluA17            Arg331 (L2)           2.5              electrostatic
GlnA21            Ser323 (L2)           5.3*             H-bond ladder
LysB29            Asp12 (L1)            2.6              electrostatic
                  Asn34                 2.5              polar

Monomer B
SerB9             Gln34 (L1)            2.8              H-bond
HisB10            Arg14                 5.0*             electrostatic (H2O bridge)
GluB13            Arg86                 2.5              electrostatic
ValB12            Phe89 (L1)            2.5              hydrophobic patch
LeuB17                                  2.5              hydrophobic patch
TyrB16            Leu87                 2.5              hydrophobic patch
PheB24            Phe88                 2.5              hydrophobic patch
PheB25                                                   hydrophobic patch
TyrB26            Tyr91                                  hydrophobic patch
GluB21            His247 (CR)           2.5              electrostatic
                  Asn249                2.5              polar
ArgB22            Glu250                4.0*             electrostatic
                  Glu287                2.5              electrostatic
ArgA22            Glu287 (L2)           2.5              electrostatic
                  His247                2.5              electrostatic/polar
AsnA5             Arg331 (L2)           2.5              polar
AsnA18                                  2.5              polar

‡ Potential vicinal interactions are grouped
† Minimum distance of approach modelled at 2.5 Å
* Closest approach; interaction would require a water molecule, hydrogen bond chain or a
rotation of the entire L2 region

Individual amino acids in insulin that are important in binding to the receptor
include: A1, A4, A5, A19, A21, B12, B16, B17, B24, B25 and B26. On the insulin
receptor amino acids that are involved in insulin binding include: 12, 14, 15, 34, 36, 39, 64,
86, 89, 90, 91, 243-251, 323 and 707-716. Only amino acids 707-716 are not in the L1-CR-L2
domains. All others are either in the walls lining the ligand binding site tunnel or are at
the entrance of the ligand binding site.
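
The statement that only residues 707-716 lie outside the L1-CR-L2 domains can be checked against the domain boundaries given in the legend to FIG. 6. The sketch below is illustrative: the boundaries and residue numbers are taken from this application, while the names `ALPHA_DOMAINS` and `domain_of` and the plain-text domain labels are assumptions introduced for the example.

```python
# Alpha-subunit domain boundaries (inclusive residue numbers) as listed in the
# legend to FIG. 6. The labels are plain-text names chosen for this sketch.
ALPHA_DOMAINS = [
    ("L1", 1, 158),
    ("CR", 159, 310),
    ("L2", 311, 470),
    ("CD/Fn0", 471, 572),
    ("aFn1", 573, 661),
    ("ID", 662, 719),
]


def domain_of(residue):
    """Return the alpha-subunit domain containing a residue number, or None."""
    for name, first, last in ALPHA_DOMAINS:
        if first <= residue <= last:
            return name
    return None


# Receptor residues named in the text as involved in insulin binding.
BINDING_RESIDUES = (
    [12, 14, 15, 34, 36, 39, 64, 86, 89, 90, 91]
    + list(range(243, 252))
    + [323]
    + list(range(707, 717))
)

# Residues that fall outside the L1, CR and L2 domains.
outside_l1_cr_l2 = sorted(
    r for r in BINDING_RESIDUES if domain_of(r) not in ("L1", "CR", "L2")
)
print(outside_l1_cr_l2)
```

With these boundaries, residues 707-716 fall in the α-insert-domain (ID), consistent with the text.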

Interaction of modulators of IR cam 

The invention also provides alternative and new methods to modulate IR activity. 
For example, the 3D structure shows that IR has two "cams" that change the 
conformation of the IR from an inactive conformation to an active conformation. The 
existence of these cams was unknown prior to this invention. Modulators such as organic 
molecules (protein or non-protein) may block or activate cam movements in order to 
modulate the IR toward an inactive state or to an active state. 

A modulator interacts with at least one insulin receptor residue listed in Table 2
on the cam-loop segment of the Cys-rich region and at least one residue in Table 2 on the
L1 surface proximate the cam-loop segment in order to activate or inhibit insulin
receptor. The modulator is capable of changing the IR from an inactive conformation to
an active conformation and/or biasing IR towards an active or inactive conformation.

The compound may also interact with at least: two, three, four, five or six (or
seven, eight, nine, ten, eleven or twelve) of the residues listed in Table 2 on each of the
cam-loop segment of the Cys-rich region and the L1 surface proximate the cam-loop
segment. The intersidechain distances between the test compound and the IR may be
varied by plus or minus about: 0.1 Å, 0.2 Å, 0.25 Å, 0.3 Å, 0.4 Å, 0.5 Å, 0.6 Å, 0.7 Å, 0.75 Å,
0.8 Å, 0.9 Å, 1 Å or >1 Å, >1.5 Å or 2 Å as long as the test compound is still able to interact
with IR and modulate its activity. It is apparent that the modulator must be able to make
appropriate interactions with the IR cam if it is to activate or inactivate the IR.

As depicted in Table 2, charged and polar amino acids in the region of the cam-loop
can bind a modulator to the receptor, to allow specificity of binding, and to move or
block the cam-loop segment.

All specific interactions with the amino acids below would be electrostatic (ionic)
except with Gln (glutamine) and Asn (asparagine) which are polar.



Table 2

Cam-loop segment of Cys-rich region     L1 surface near cam-loop segment

Lys265      electrostatic               Glu1 NH3+   electrostatic
Lys267      electrostatic               Asn15       polar
Asn268      polar                       Asn16       polar
Arg270      electrostatic               Arg19       electrostatic
Arg272      electrostatic               Glu22       electrostatic
Glu273      electrostatic               Glu24       electrostatic
                                        Asn25       polar
                                        Glu44       polar
                                        Asp45       electrostatic
                                        Arg47       electrostatic
                                        Asp48       electrostatic
                                        Lys53       electrostatic



The invention includes a method of agonizing or antagonizing IR activity by 
administering a modulator identified according to the methods of the invention. 
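
The cam criterion set out above (contact with at least a chosen number of Table 2 residues on the cam-loop segment and on the L1 surface near it) can be expressed as a simple set test. This is a sketch under stated assumptions: the residue names are transcribed from Table 2 (the N-terminal amino group entry is omitted for simplicity), and the function names and the `minimum` parameter are hypothetical.

```python
# Residues transcribed from Table 2. The N-terminal NH3+ entry is left out.
CAM_LOOP = {"Lys265", "Lys267", "Asn268", "Arg270", "Arg272", "Glu273"}
L1_SURFACE = {
    "Asn15", "Asn16", "Arg19", "Glu22", "Glu24", "Asn25",
    "Glu44", "Asp45", "Arg47", "Asp48", "Lys53",
}


def cam_contacts(contacted):
    """Split the contacted residues into cam-loop and L1-surface groups."""
    contacted = set(contacted)
    return sorted(CAM_LOOP & contacted), sorted(L1_SURFACE & contacted)


def meets_cam_criterion(contacted, minimum=1):
    """True when at least `minimum` residues are contacted in each group."""
    cam, l1 = cam_contacts(contacted)
    return len(cam) >= minimum and len(l1) >= minimum


# Hypothetical compound touching one residue on each surface.
print(meets_cam_criterion(["Lys265", "Glu22"]))
```

Raising `minimum` to two, three or more mirrors the stricter alternatives recited in the text.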

IR modulating compounds 

A diagnostic or therapeutic modulating compound of the present invention can be, 
but is not limited to, at least one selected from a nucleic acid, a compound, a protein, a 
lipid, a saccharide, an isotope, a carbohydrate, an imaging agent, a lipoprotein, a 
glycoprotein, an enzyme, a detectable probe, an antibody or fragment thereof, or any
combination thereof. Diagnostic compounds (useful in diagnosis as a research tool in an
assay) can be detectably labeled as for labeling antibodies. Such labels include, but are
not limited to, enzymatic labels, radioisotope or radioactive compounds or elements,
fluorescent compounds or metals, chemiluminescent compounds and bioluminescent
compounds. Other types of compounds may also be useful.

The compound may include an amino acid sequence (including a peptide, a
polypeptide or a protein) or an amino acid sequence derivative (i.e. an analog, prepared for
example by substituting, deleting or modifying (e.g. glycosylating) one or more amino acids
- see, for example, US Patent Nos. 5,952,297, 5,922,675, 5,700,662, 5,693,609,
5,64632, 5,149,777, 5,008,241, 4,946,828 and 5,164,366. The analog may also be part
of a human insulin analog complex, such as that in US 5,474,978).

The analog may be an insulin derivative, an insulin precursor derivative or a
derivative of an already known insulin analog (see, for example, US Patent Nos.
5,952,297, 5,922,675, 5,747,642, 5,716,927). One skilled in the art may analyze insulin,
its precursors, and other known analogs to determine how they interact with IR and then
prepare improved compounds.

Those of skill in the art recognize that a variety of techniques are available for
constructing derivatives with the same or similar desired biological activity as insulin but
with more favorable activity than the polypeptide with respect to route of administration,
solubility, stability, and/or susceptibility to hydrolysis and proteolysis. See, for example,
Morgan and Gainor, Ann. Rep. Med. Chem., 24:243-252 (1989). Examples of
polypeptide derivatives are described in U.S. Patent No. 5,643,873. Other patents
describing how to make and use derivatives include, for example, 5,786,322, 5,767,075,
5,763,571, 5,753,226, 5,683,983, 5,677,280, 5,672,584, 5,668,110, 5,654,276, 5,643,873.
Derivatives may be designed on computer by comparing compounds to the 3D structures
disclosed in this application. Derivatives of insulin may also be made according to other
techniques known in the art, for example, by treating a polypeptide of the invention with
an agent that chemically alters a side group by converting a hydrogen group to another
group such as a hydroxy or amino group. Derivatives can include sequences that are
either entirely made of amino acids or sequences that are hybrids including amino acids
and modified amino acids or other organic molecules.

The compound may also be a nonprotein organic molecule, such as a mimetic (i.e.
a non-protein molecule which functionally mimics a peptide, polypeptide or a protein).
For example, a mimetic may functionally mimic insulin by binding to IR and activating
it. Such a mimetic may activate IR to a greater or lesser extent than that caused by
insulin as long as the mimetic produces the end result of IR activation. Examples of
mimetics are pyrrolidine compounds such as
(2R,3R,4R)-3,4-dihydroxy-2-hydroxymethylpyrrolidine and other substituted
2-methylpyrrolidines (e.g. US No. 5,854,272) or hydroxy alkyl piperidine (e.g. US No.
5,863,903). Small organic molecules
may also be used to antagonize or agonize IR by interacting with a cam.

A compound can have a therapeutic effect on the target cells, the effect being one of
those known to be caused by modulation of IR. A therapeutic effect that modulates at
least one IR in a cell can be provided by a therapeutic agent delivered to a target cell via
pharmaceutical administration (discussed below).

Determining suitable types of modulators from IR structure 

One skilled in the art would recognize, in view of the fitted quaternary structure of
IR, that the type of modulator used may be varied or customized according to the portion of
IR targeted. For example, modulators may be simple peptides which take advantage of
specific hydrophilic, hydrophobic, or charge interactions, or variously branched peptides
with each branch differentially contributing to a particular interaction (such as the loligomer
structures of Gariepy and co-workers: PNAS USA 92, 2056-60, 1995; Bioconjugate Chem.
10, 745-54, 1999). Modulators may be simpler chemicals with corresponding interaction
sites, in or near the insulin binding contact sites of IR. Such agents may also be molecules
that act external to the insulin binding site to effect activation or inhibition by interacting
with specific sites identified as important in the mechanism of transmembrane signal
transduction. These include specific chemicals, peptides or monoclonal and polyclonal
antibodies or subantibody fragments such as the Fab or Fv fragments. They include
molecules that specifically remove or enhance the natural blockage on the insulin receptor
to activation of its intrinsic tyrosine kinase. Such agents may also be molecules that
enhance or inhibit transphosphorylation of the juxtaposing intrinsic pair of tyrosine kinase
domains of the dimeric insulin receptor.

Determining structure of IR, IR variants and other receptors 

1. Complete IR Structure

Techniques described in this application (such as those in references 4 and 5 or
US 5,834,228) were used to identify and characterize regions of an insulin receptor such
as the L1-Cys-rich-L2 domain. We characterize the entire insulin receptor and its ligand
binding site using these techniques. The fitted quaternary structure of IR needed for drug 
design is disclosed in this application. We characterize the regions which are not fully 
characterized by x-ray crystallography or nuclear magnetic resonance spectroscopy of 
small domains that include these structures. They include the transmembrane and 
juxtamembrane domains, the beta insert and the beta C-terminal regions (see legend to 
FIG. 6). 

2. IR Variants and Other Receptors 

The IR data of the invention may also be used to solve the structure of IR variants
(e.g. mutants, homologs) or other dimeric receptors, or of any other protein with
significant amino acid sequence homology to any functional or structural domain of IR.
We determine the structure of isoforms A and B of IR as well as mutants. IR has two
isoforms, A and B. Isoform A is shorter than isoform B by 12 amino acids which are
coded by exon 11 of the IR gene (the twelve amino acids are from Lys718 to Arg729 as
follows: Lys-Thr-Ser-Ser-Gly-Thr-Gly-Ala-Glu-Asp-Pro-Arg). Isoform A interacts with
insulin and produces the same effect as isoform B, which is a metabolic effect.
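
As a consistency check on the exon 11 segment described above, the twelve residues can be numbered from Lys718 to Arg729 and converted to one-letter code. The snippet is illustrative only; the variable names are assumptions, and the three-letter to one-letter mapping is the standard amino acid code.

```python
# Standard three-letter to one-letter codes for the residues that occur here.
THREE_TO_ONE = {
    "Lys": "K", "Thr": "T", "Ser": "S", "Gly": "G", "Ala": "A",
    "Glu": "E", "Asp": "D", "Pro": "P", "Arg": "R",
}

# Exon 11 segment as recited in the text (present in isoform B only).
EXON11 = "Lys-Thr-Ser-Ser-Gly-Thr-Gly-Ala-Glu-Asp-Pro-Arg"
FIRST, LAST = 718, 729

residues = EXON11.split("-")
one_letter = "".join(THREE_TO_ONE[r] for r in residues)

# Map each residue number to its amino acid.
numbered = {FIRST + i: name for i, name in enumerate(residues)}

print(len(residues), one_letter)
print(numbered[FIRST], numbered[LAST])
```

The count of twelve residues agrees with the inclusive range 718 to 729.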

The insulin receptor described in this application was extracted from human
placenta. Insulin receptor from other sources, such as other tissues, cells or cDNA may 
also be modeled and used in the methods of the invention. The techniques described in 
this application to image the receptor may be used with insulin receptor from any human, 
mammalian or other tissue. Insulin receptor homologues and other forms of insulin 
receptor, mutants and co-complexes of insulin receptor may also be used. A fragment of 
the receptor may also be used. A fragment may be from about 25-50, about 50-100, 
about 100-250 or about 250-500, 500-1000 or at least about 1000 amino acids. 

The IR is similar to other dimeric receptors, such as IGF-1R and IRR. The 3D
structure of IR may be used to determine the 3D structure of these receptors by 
identifying regions of homology (similarity between amino acid, secondary, tertiary or 
quaternary structure) between the receptors and determining the structure of the dimeric 
receptor. 

One useful method for this purpose is molecular replacement in crystallography.
In this method, the unknown structure in a crystal, whether it is another form of IR, an IR
mutant, or the structure of some other dimeric receptor with significant amino acid
sequence homology to any functional domain of IR, may be determined using the
structure coordinates of the IR dimer of this invention. This method
provides an accurate structural form for the unknown structure more quickly and
efficiently than solving the structure without a starting model.

Computer based design 

The invention allows computational screening of molecule databases for
compounds that can bind in whole, or in part, to IR. The IR structure of the invention
permits the design and identification of synthetic compounds and/or other molecules
which have a shape complementary to the conformation of the IR ligand binding site of
the invention. Using known computer systems, the coordinates of the IR structure of the
invention may be provided in machine readable form, the test compounds designed
and/or screened and their conformations superimposed on the complementary surface
structures and surface characteristics of the receptor or of its binding site. Subsequently,
suitable candidates identified as above may be screened for the desired activity, stability,
and other characteristics.

In this screening, the quality of fit of such entities or compounds to the binding
site may be judged either by shape complementarity [R.L. DesJarlais et al., J. Med. Chem.
31:722-729 (1988)] or by estimated interaction energy [E.C. Meng et al., J. Comp. Chem.
13:505-524 (1992)].
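
An estimated interaction energy of the kind mentioned above is commonly computed as a sum of pairwise non-bonded terms. The sketch below uses a Lennard-Jones 12-6 term plus a Coulomb term with a distance-dependent dielectric. It is a generic illustration and not the scoring scheme of the cited references; the `Atom` type, the function names and the parameter values (`epsilon`, `sigma`, the dielectric of 4r) are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass

# Conversion factor giving Coulomb energies in kcal/mol for charges in
# elementary units and distances in angstroms.
COULOMB_CONSTANT = 332.06


@dataclass(frozen=True)
class Atom:
    x: float
    y: float
    z: float
    charge: float


def distance(a, b):
    """Euclidean distance between two atoms in angstroms."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


def pair_energy(a, b, epsilon=0.1, sigma=3.5):
    """Lennard-Jones 12-6 plus Coulomb energy for one atom pair.

    epsilon and sigma are placeholder values; the dielectric is taken as 4r.
    """
    r = distance(a, b)
    lennard_jones = 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
    coulomb = COULOMB_CONSTANT * a.charge * b.charge / (4.0 * r * r)
    return lennard_jones + coulomb


def interaction_energy(ligand, receptor):
    """Sum pair energies over every ligand atom and every receptor atom."""
    return sum(pair_energy(l, r) for l in ligand for r in receptor)


# Hypothetical charged groups 4 angstroms apart.
receptor_site = [Atom(0.0, 0.0, 0.0, +1.0)]
acidic_group = [Atom(4.0, 0.0, 0.0, -1.0)]
basic_group = [Atom(4.0, 0.0, 0.0, +1.0)]
print(interaction_energy(acidic_group, receptor_site) < 0)
print(interaction_energy(basic_group, receptor_site) > 0)
```

A more negative total indicates a more favourable modelled fit, so candidate compounds can be ranked by this value before any synthesis is attempted.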

Thus, the IR structure permits the screening of known molecules and/or the 
designing of new molecules which bind to the IR structure, particularly at the ligand 
binding site or cams, via the use of computerized evaluation systems. For example, 
computer modeling systems are available in which the sequence of the IR, and the IR
structure (i.e., atomic coordinates of IR and/or the atomic coordinates of the ligand binding
site cavity, bond angles, dihedral angles, distances between amino acids in the ligand
binding site region, etc., as provided by the fitted quaternary structure) may be input. A
machine readable medium may be encoded with data representing the coordinates of the 
entire IR structure. The computer then generates structural details of the site into which a 
test compound should bind, thereby enabling the determination of the complementary 
structural details of said test compound. 
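
One widely used machine readable encoding of atomic coordinates is the fixed-column ATOM record of the PDB file format. The sketch below writes and reads such records and measures the distance between two atoms. It is an illustration only: the application does not state which format is used, the helper names are hypothetical, and the coordinates and chain labels are invented for the example.

```python
import math


def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Write one PDB-style ATOM record (fixed columns, coordinates in angstroms)."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


def parse_atom(line):
    """Read the fields of an ATOM record by column position."""
    return {
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


# Invented coordinates for a receptor atom and a ligand atom.
receptor_line = format_atom(1, "CZ", "ARG", "A", 86, 10.0, 12.0, 8.0)
ligand_line = format_atom(2, "CD", "GLU", "I", 4, 10.0, 12.0, 10.5)

receptor_atom = parse_atom(receptor_line)
ligand_atom = parse_atom(ligand_line)
separation = math.dist(receptor_atom["xyz"], ligand_atom["xyz"])
print(receptor_atom["res_name"], receptor_atom["res_seq"], round(separation, 3))
```

Distances obtained this way are the kind of quantity that can be compared with the intersidechain distances tabulated in Table 1.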

The production of compounds that bind to or modulate IR generally involves two factors.
First, the compound must be capable of physically and structurally associating with IR.
Non-covalent molecular interactions important in the association of IR with its substrate
include hydrogen bonding, ionic interactions, van der Waals interactions and hydrophobic
interactions.

The invention permits the design of agents that bind to the three dimensional
surfaces of IR by using the pattern on those surfaces of positive charges, negative
charges, hydrophobic groupings of atoms, dipolar groups and hydrogen bonds that are
revealed in the structure of the surfaces and in the relative positioning of these surfaces
with respect to each other in the quaternary structure.



Those skilled in the art can create an agent that places the positions of chemical 
groups on the agent near matching atoms or groups of atoms on IR using well-known 
interactions such those as in Table3 . 

Table 3 

Characteristics of atoms or          Matching characteristics of atoms or 
groups of atoms on IR                groups of atoms on the agent 

- positive charge                    - negative charge 
- negative charge                    - positive charge 
- hydrophobic group                  - hydrophobic group 
- polar group                        - polar group 
- hydrogen donor                     - hydrogen acceptor 
- hydrogen acceptor                  - hydrogen donor 

Second, the compound must be able to assume a conformation that allows it to 
associate with IR. The compound will preferably interact with the ligand binding site or 
a cam and bias or change IR towards either an active conformation or inactive 
conformation. Although certain portions of the compound will not directly participate in 
this association with IR, those portions may still influence the overall conformation of the 
molecule. This, in turn, may have a significant impact on potency. Such conformational 
requirements include the overall three-dimensional structure and orientation of the 
chemical entity or compound in relation to all or a portion of the binding site, e.g., ligand 
binding site, accessory binding site, or cam of IR, or the spacing between functional 
groups of a compound comprising several chemical entities that directly interact with IR. 

The potential modulating effect of a chemical compound with IR may be 
estimated prior to its actual synthesis and testing by the use of computer modeling 
techniques. If the structure of the compound shows insufficient interaction and 
association between it and IR, the compound is not synthesized and tested. If computer 
modeling indicates a suitable interaction, the molecule may then be synthesized and 
tested for its ability to bind to IR in an assay. Synthesis of ineffective and inoperative 
compounds can be avoided. 
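
By way of illustration only, the complementary pairings of Table 3 can be written as a simple lookup and used to count how many positions of a candidate agent match an IR surface patch. The following is a minimal Python sketch, not a method of the invention as claimed; the function name count_matches, the one-to-one positional pairing of features, and the example feature lists are all hypothetical.

```python
# Complementary feature pairs taken from Table 3 (IR feature -> agent feature).
COMPLEMENT = {
    "positive charge": "negative charge",
    "negative charge": "positive charge",
    "hydrophobic group": "hydrophobic group",
    "polar group": "polar group",
    "hydrogen donor": "hydrogen acceptor",
    "hydrogen acceptor": "hydrogen donor",
}


def count_matches(ir_features, agent_features):
    """Count positions where the agent feature complements the IR feature.

    Both arguments are equal-length lists; position i of the agent is assumed
    (hypothetically) to sit next to position i of the IR surface.
    """
    if len(ir_features) != len(agent_features):
        raise ValueError("feature lists must have the same length")
    return sum(
        1
        for ir, agent in zip(ir_features, agent_features)
        if COMPLEMENT.get(ir) == agent
    )


# Hypothetical three-position patch and candidate, for illustration only.
ir_site = ["positive charge", "hydrogen donor", "hydrophobic group"]
candidate = ["negative charge", "hydrogen donor", "hydrophobic group"]
print(count_matches(ir_site, candidate))  # 2 of 3 positions are complementary
```

In this toy case the donor-donor pair does not match, so a designer would look for a candidate presenting a hydrogen acceptor at that position.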



Computer modeling may be combined with assay techniques. For example, one 
could probe the IR (or fragments thereof) with a variety of different molecules to 
determine optimal sites for interaction between candidate modulators and IR. Small 
molecules that bind tightly to IR sites can be designed and synthesized and tested for their 
IR modulatory activity. This information can be combined with computer modeling 
information. A modulating compound may be computationally evaluated. A modulating 
compound may be further designed by a series of steps in which compounds or fragments 
are screened and selected for their ability to associate with the individual binding amino 
acids, secondary, tertiary or quaternary structure or other areas of IR. 

One skilled in the art may use one of several methods to screen chemical entities 
or fragments for their ability to interact with IR. This process may begin by generating the 
ligand binding site on the computer screen based on the IR amino acids and distances 
from the co-ordinates of the IR complex. Selected fragments or chemical entities are then 
positioned against IR. Docking may be accomplished using software such as Insight, 
Quanta, and Sybyl, followed by energy minimization and molecular dynamics with 
standard molecular mechanics forcefields, such as CHARMM and AMBER. 



Specialized computer programs may also assist in the process of selecting 
fragments or chemical entities. These include: 

MCSS (Molecular Simulations, Burlington, MA) [A. Miranker and M. Karplus, 
"Functionality Maps of Binding Sites: A Multiple Copy Simultaneous Search Method", 
Proteins: Structure, Function and Genetics, 11:29-34 (1991)]. 

GRID (Oxford University, Oxford, UK) [P. J. Goodford, "A Computational 
Procedure for Determining Energetically Favorable Binding Sites on Biologically 
Important Macromolecules", J. Med. Chem. 28:849-857 (1985)]. 

DOCK (University of California, San Francisco, CA) [I.D. Kuntz et al., "A 
Geometric Approach to Macromolecule-Ligand Interactions", J. Mol. Biol. 161:269-288 
(1982)]. 

AUTODOCK (Scripps Research Institute, La Jolla, CA) [D.S. Goodsell and A.J. 
Olson, "Automated Docking of Substrates to Proteins by Simulated Annealing", 
Proteins: Structure, Function, and Genetics, 8:192-202 (1990)]. 

Additional commercially available computer databases for small molecular 
compounds include Cambridge Structural Database and Fine Chemical Database. For a 
review see Rusinko, A., Chem. Des. Auto. News 8:44-47 (1993). 

For example, software such as GRID (a program that determines probable 
interaction sites between probes with various functional group characteristics and the 
enzyme surface) is used to analyze the ligand binding site to determine structures of 
modulating compounds. Using probes bearing suitable activating or inhibiting groups 
(e.g. protonated primary amines), the program calculates suitable conformations. The 
program also identifies potential hot spots around accessible positions at suitable energy 
contour levels. Suitable ligands, such as inhibiting or activating compounds or 
compositions, are then tested for modulating IR. 
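
The idea of scanning a probe over a grid of positions and looking for low-energy "hot spots" can be sketched in a few lines. The Python below is an illustrative toy only: it is not the GRID program or its energy function, and the Lennard-Jones and Coulomb parameters, the distance-dependent dielectric, the grid extent and the single-atom "surface" are all assumptions chosen for the example.

```python
import math


def probe_energy(point, atoms, probe_charge=1.0, epsilon=0.2, sigma=3.5):
    """Toy interaction energy of a probe at `point` with a list of atoms.

    Each atom is (x, y, z, partial_charge). The energy is a Lennard-Jones
    12-6 term plus a Coulomb term with a distance-dependent dielectric (4r);
    units are kcal/mol and angstroms. All parameters are illustrative
    assumptions, not values from any published force field.
    """
    total = 0.0
    for x, y, z, q in atoms:
        r = max(math.dist(point, (x, y, z)), 0.5)  # avoid division by zero
        lj = 4.0 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)
        coulomb = 332.0 * probe_charge * q / (4.0 * r * r)
        total += lj + coulomb
    return total


def best_grid_point(atoms, lo=-6.0, hi=6.0, step=1.0):
    """Scan a cubic grid and return (energy, point) with the lowest energy."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return min(
        (probe_energy((x, y, z), atoms), (x, y, z))
        for x in axis
        for y in axis
        for z in axis
    )


# Hypothetical surface: one negatively charged atom at the origin, probed
# with a positively charged (amine-like) probe.
energy, point = best_grid_point([(0.0, 0.0, 0.0, -0.5)])
print(energy < 0, point)
```

A real calculation would use the full set of receptor atoms and a validated force field; the sketch only shows the grid-scan structure of such a search.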

Once suitable chemical entities or fragments have been selected, they can be 
assembled into a single compound. Assembly may proceed by visual inspection of 
the relationship of the fragments to each other on the three-dimensional image displayed 
on a computer screen in relation to the structure coordinates of IR. This would typically 
be followed by manual model building using software such as Quanta or Sybyl. 

Useful programs to aid one of skill in the art in connecting the individual 
chemical entities or fragments include: 

3D Database systems such as MACCS-3D (MDL Information Systems, San 
Leandro, CA). See Y.C. Martin, "3D Database Searching in Drug Design", J. Med. Chem., 
35:2145-2154 (1992). 

CAVEAT (University of California, Berkeley, CA) [P.A. Bartlett et al., "CAVEAT: 
A Program to Facilitate the Structure-Derived Design of Biologically Active Molecules", in 
Molecular Recognition in Chemical and Biological Problems, Special Pub., Royal Chem. 
Soc. 78, pp 182-196 (1989)]. 

HOOK (Molecular Simulations, Burlington, MA). 

Instead of proceeding to build an IR modulator in a step-wise fashion one fragment 
or chemical entity at a time as described above, inhibitory or other types of binding 
compounds may be designed as a whole or "de novo" using either an empty ligand 
binding site or optionally including some portion(s) of a known compound(s). These 
methods include: 

LUDI (Biosym Technologies, San Diego, CA) [H.-J. Bohm, "The Computer 
Program LUDI: A New Method for the De Novo Design of Enzyme Inhibitors", J. 
Comp. Aid. Molec. Design, 6:61-78 (1992)]. 

LEGEND (Molecular Simulations, Burlington, MA) [Y. Nishibata and A. Itai, 
Tetrahedron 47:8985 (1991)]. 

LeapFrog (Tripos Associates, St. Louis, MO). 

Other molecular modeling techniques may also be used. See, e.g., N.C. Cohen et 
al., "Molecular Modeling Software and Methods for Medicinal Chemistry", J. Med. 
Chem. 33:883-894 (1990); M.A. Navia and M.A. Murcko, "The Use of Structural 
Information in Drug Design", Current Opinions in Structural Biology 2:202-210 (1992). 
For example, where the structures of test compounds are known, a model of the test 
compound may be superimposed over the model of the structure of the invention. 
Numerous methods and techniques are known in the art for performing this step, any of 
which may be used. See, e.g., P.S. Farmer, Drug Design, Ariens, E.J., ed., Vol. 10, 
pp 119-143 (Academic Press, New York, 1980); U.S. Patent No. 5,331,573; U.S. Patent 
No. 5,500,807; C. Verlinde, Structure 2:577-587 (1994); and I.D. Kuntz, Science 
257:1078-1082 (1992). The model building techniques and computer evaluation systems 
described herein are not a limitation on the present invention. 

Using these computer evaluation systems, a large number of compounds may be 
quickly and easily examined and expensive and lengthy biochemical testing avoided. 
Moreover, the need for actual synthesis of many compounds is effectively eliminated. 

Apparatus including the IR fitted quaternary structure or other IR structural 
information 

Storage media for the IR fitted quaternary structure or other IR structural 
information include, but are not limited to: magnetic storage media, such as floppy discs, 
hard disc storage medium, and magnetic tape; optical storage media such as optical discs 
or CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these 
categories such as magnetic/optical storage media. Any suitable computer readable 
medium can be used to create a manufacture comprising a computer readable medium 
having recorded on it an amino acid sequence and/or data of the present invention. 

"Recorded" refers to a process for storing information on computer readable 
medium. A skilled artisan can readily adopt any of the presently known methods for 
recording information on computer readable medium to store an amino acid sequence, 
nucleotide sequence and/or EM data information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon an amino acid sequence and/or data 
of the present invention. The choice of the data storage structure will generally be based 
on the means chosen to access the stored information. In addition, a variety of data 
processor programs and formats can be used to store the sequence and data information of 
the present invention on computer readable medium. The sequence information can be 
represented in a word processing text file, formatted in commercially-available software 
such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, 
stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled 
artisan can readily adapt any number of data processor structuring formats (e.g. text file 
or database) in order to obtain computer readable medium having recorded thereon the 
information of the present invention. 
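
As one hedged illustration of recording sequence and coordinate data in an ASCII text format and reading it back, the following Python sketch uses a JSON text file. The file name, the record layout (name, sequence, coordinates) and the placeholder data are assumptions made for the example; they are not a format prescribed by this application and the values are not taken from the IR structure.

```python
import json
import os
import tempfile


def save_record(path, name, sequence, coordinates):
    """Write a sequence and its (x, y, z) coordinates to a plain-text JSON file."""
    record = {"name": name, "sequence": sequence, "coordinates": coordinates}
    with open(path, "w", encoding="ascii") as handle:
        json.dump(record, handle, indent=2)


def load_record(path):
    """Read a record written by save_record back into a dictionary."""
    with open(path, "r", encoding="ascii") as handle:
        return json.load(handle)


# Placeholder data for illustration only; not taken from the IR structure.
example_path = os.path.join(tempfile.gettempdir(), "ir_fragment_example.json")
save_record(
    example_path,
    "example fragment",
    "ACDEFGHIK",
    [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0]],
)
print(load_record(example_path)["sequence"])  # ACDEFGHIK
```

The same record could equally be held in a database table; the choice, as noted above, depends on how the stored information is to be accessed.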

By providing the sequence and/or data on computer readable medium and the 
structural information in this application, a skilled artisan can routinely access the 
sequence and data to model a receptor, a subdomain thereof, or a ligand thereof. As 
described above, computer algorithms are publicly and commercially available which 
allow a skilled artisan to access this data provided in a computer readable medium and 
analyze it for molecular modeling or other uses. 

The present invention further provides systems, particularly computer-based 
systems, which contain the sequence and/or data described herein. Such systems are 
designed to do molecular modeling for an IR or at least one subdomain or fragment 
thereof. 

As used herein, "a computer-based system" refers to the hardware means, 
software means, and data storage means used to analyze the sequence and/or data of the 
present invention. The minimum hardware means of the computer-based systems of the 
present invention comprises a central processing unit (CPU), input means, output means, 
and data storage means. A skilled artisan can readily appreciate which of the currently 
available computer-based systems are suitable for use in the present invention. 

As stated above, the computer-based systems of the present invention comprise a 
data storage means having stored therein an IR or fragment sequence and/or data of the 
present invention and the necessary hardware means and software means for supporting 
and implementing an analysis means. As used herein, "data storage means" refers to 
memory which can store sequence or data (coordinates, distances, quaternary structure, 
etc.) of the present invention, or a memory access means which can access manufactures 
having recorded thereon the sequence or data of the present invention. 

As used herein, "search means" or "analysis means" refers to one or more 
programs which are implemented on the computer-based system to compare a target 
sequence or target structural motif with the sequence or data stored within the data 
storage means. Search means are used to identify fragments or regions of an IR which 
match a particular target sequence or target motif. A variety of known algorithms are 
disclosed publicly and a variety of commercially available software for conducting search 
means is available and can be used in the computer-based systems of the present 
invention. A skilled artisan can readily recognize that any one of the available algorithms 
or implementing software packages for conducting computer analyses can be adapted 
for use in the present computer-based systems. 
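
A "search means" in its simplest form compares a target motif with a stored sequence and reports where it matches. The following minimal Python sketch shows that comparison using a regular expression; the function name find_motif, the placeholder sequence and the motif pattern are hypothetical and are not drawn from the IR sequence.

```python
import re


def find_motif(sequence, pattern):
    """Return (start, matched_text) for each regex match, with 1-based starts."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, sequence)]


# Placeholder sequence and motif; 'Y..Y' means Tyr, any two residues, Tyr.
stored_sequence = "MGTYAAYKLLYGGY"
print(find_motif(stored_sequence, r"Y..Y"))  # [(4, 'YAAY'), (11, 'YGGY')]
```

Note that re.finditer reports non-overlapping matches only; a structural motif search would compare three-dimensional features rather than text, which this sketch does not attempt.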

As used herein, "a target structural motif," or "target motif," refers to any 
rationally selected sequence or combination of sequences in which the sequence(s) are 
chosen based on a three-dimensional configuration or electron density map which is 
formed upon the folding of the target motif. There are a variety of target motifs known in 
the art. Protein targets include, but are not limited to, ligand binding sites, structural 
subdomains, epitopes, and functional domains. A variety of structural formats for the 
input and output means can be used to input and output the information in the computer- 
based systems of the present invention. 

One application of this embodiment is provided in FIG. 13. This figure provides 
a block diagram of a computer system 5 that can be used to implement the present 
invention. The computer system 5 includes a processor 10 connected to a bus 15. Also 
connected to the bus 15 are a main memory 20 (preferably implemented as random access 
memory, RAM) and a variety of secondary storage memory 25 such as a hard drive 30 
and a removable medium storage device 35. The removable medium storage device 35 may 
represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. 
A removable storage unit 40 (such as a floppy disk, a compact disk, a magnetic tape, etc.) 
containing control logic and/or data recorded therein may be inserted into the removable 
medium storage device 35. The computer system 5 includes appropriate software for 
reading the control logic and/or the data from the removable medium storage device 35 
once inserted in the removable medium storage device 35. A monitor 45 connected to 
the bus 15 can be used to visualize the structure determination data. 

Amino acid, encoding nucleotide or other sequence and/or data of the present 
invention may be stored in a well known manner in the main memory 20, any of the 
secondary storage devices 25, and/or a removable storage unit 40. Software for 
accessing and processing the amino acid sequence and/or data (such as search tools, 
comparing tools, etc.) resides in main memory 20 during execution. 

One or more computer modeling steps and/or computer algorithms are used as 
described above to provide a molecular 3-D model, preferably showing the fitted 
quaternary structure, of a cleaved dimeric receptor, using amino acid sequence data and 
atomic coordinates for the receptor. The structure of other dimeric receptors such as 
IGFR and IRR may be readily determined using methods of the invention and the present 
knowledge of these receptors. 

Assays of modulators identified from IR structure 

Once identified, the modulator may then be tested for bioactivity using standard 
techniques (e.g. in vitro or in vivo assays). For example, the compound identified by drug 
design may be used in binding assays using conventional formats to screen agonists (e.g. 
by measuring in vivo or in vitro binding of receptor to insulin after addition of a 
compound). One assay is the fat cell assay for glucose uptake and oxidation, which is 
known in the art. Experiments may also be done with whole diabetic animals. Suitable 
assays include, but are not limited to, the enzyme-linked immunosorbent assay (ELISA), 
or a fluorescence quench assay. In evaluating IR modulators for biological activity in 
animal models (e.g. rat, mouse, rabbit), various oral and parenteral routes of 
administration are evaluated. Using this approach, it is expected that modulation of an IR 
occurs in suitable animal models, using the ligands discovered by molecular modeling. 

Once identified and screened for biological activity, these inhibitors may be used 
therapeutically or prophylactically to modulate IR activity as described below. 

Pharmaceutical/diagnostic formulations of modulators identified from quaternary 
structure, methods of medical treatment and uses 

1. Modulating IR in a Cell 

The present invention also provides a method for modulating the activity of the IR 
in a cell using IR modulating compounds or compositions of the invention. In general, 
compounds (antagonists or agonists) which have been identified to inhibit or enhance the 
activity of IR can be formulated so that the agent can be contacted with a cell expressing 
an IR protein in vivo. The contacting of such a cell with such an agent results in the in vivo 
modulation of the activity of the IR proteins. So long as a formulation barrier or toxicity 
barrier does not exist, agents identified in the assays described above will be effective for 
in vivo and in vitro use. These modulators may be used in therapies that are beneficial in 
the treatment of diabetes and other diseases, disorders and abnormal physical states 
characterized by improper or inadequate insulin receptor activity. Even if receptor 
activity is normal, there may be therapeutic benefit in upregulating or downregulating its 
activity in some circumstances. 

2. Medical Treatments and Uses 

Diseases, disorders and abnormal physical states that may be treated by IR 
agonists include diabetes and hyperglycemia. Diseases, disorders and abnormal physical 
states that may be treated by IR antagonists include hypoglycemia. 

Isoform A of IR is shorter than isoform B by 12 amino acids which are coded by 
exon 11 of the IR gene. Isoform A interacts with insulin and produces the same effect as 
isoform B, which is a metabolic effect. Isoform A also acts as an IGF-2 receptor, which 
may be important in the growth of cancer cells (Frasca, F, Pandini, G, Scalia, P, Sciacca, L, 
Mineo, R, Costantino, A, Goldfine, ID, Belfiore, A, Vigneri, R, 1999, Insulin receptor 
isoform A: A newly recognized high affinity insulin like growth factor II receptor in fetal 
and cancer cells. Molecular and Cellular Biology 19:5 pg. 3278-3288.). IGF-2 acts on 
isoform A to produce a growth effect via IR rather than just a metabolic effect. The 
quaternary structure of isoform A is very similar to isoform B and can be readily 
determined according to the information in this application. IGF-I binds to both isoforms 
with low affinity (1/10) and also produces a growth effect (less significant because of the 
low affinity binding). One may design an antagonist of isoform A that does not interact 
with isoform B (or at least has lower affinity binding to isoform B) to inhibit cancer cell 
growth in response to IGF-2. 

3. Pharmaceutical Compositions 

Modulators may be combined in pharmaceutical compositions according to 
known techniques. The compounds of this invention are preferably incorporated into 
pharmaceutical dosage forms suitable for the desired administration route such as tablets, 
dragees, capsules, granules, suppositories, solutions, suspensions and lyophilized 
compositions to be diluted to obtain injectable liquids. The dosage forms are prepared by 
conventional techniques and in addition to the compounds of this invention could contain 
solid or liquid inert diluents and carriers and pharmaceutically useful additives such as 
lipid vesicles, liposomes, aggregants, disaggregants, salts for regulating the osmotic 
pressure, buffers, sweeteners and colouring agents. Slow release pharmaceutical forms 
for oral use may be prepared according to conventional techniques. Other pharmaceutical 
formulations are described for example in US 5,192,746. 

Pharmaceutical compositions used to treat patients having diseases, disorders or 
abnormal physical states could include a compound of the invention and an acceptable 
vehicle or excipient (Remington's Pharmaceutical Sciences 18th ed. (1990, Mack 
Publishing Company) and subsequent editions). Vehicles include saline and D5W (5% 
dextrose and water). Excipients include additives such as a buffer, solubilizer, 
suspending agent, emulsifying agent, viscosity controlling agent, flavor, lactose filler, 
antioxidant, preservative or dye. The compound may be formulated in solid or semisolid 
form, for example pills, tablets, creams, ointments, powders, emulsions, gelatin capsules, 
capsules, suppositories, gels or membranes. Routes of administration include oral, 
topical, rectal, parenteral (injectable), local, inhalant and epidural administration. The 
compositions of the invention may also be conjugated to transport molecules to facilitate 
transport of the molecules. The methods for the preparation of pharmaceutically 
acceptable compositions which can be administered to patients are known in the art. 

The pharmaceutical compositions can be administered to humans or animals. 
Dosages to be administered depend on individual patient condition, indication of the 
drug, physical and chemical stability of the drug, toxicity, the desired effect and on the 
chosen route of administration (Robert Rakel, ed., Conn's Current Therapy (1995, W.B. 
Saunders Company, USA)). 

Example 1 - Determination of the 3D structure of IR 

Preparation of IR 

Insulin receptor protein (HIR) was solubilized from human placental membranes 
and purified by affinity chromatography on an insulin column (9) followed by further FPLC 
purification on Sephacryl S-200. The purity of HIR was better than 95% by sodium 
dodecyl sulfate polyacrylamide gel electrophoresis. HIR was incubated with NG-BI (final 
concentration of ~0.5 x 10* M) at 4° C overnight in 20 mM HEPES buffer (pH 7.5) at a 
molar ratio of insulin:HIR of ~10:1. Free NG-BI was removed by microfiltration with a 
cut-off of 300 kDa (Sigma). The mixture was diluted to 7.5 ng of receptor protein/ml with 
20 mM HEPES buffer, pH 7.5, prior to loading on the grid. 



Preparation of Specimen for STEM 

The specimen (5 µl) was injected into 5 µl of the dilution buffer on a 300-mesh 
copper grid coated with a holey plastic film overlaid with a carbon film 23 Å thick, washed 
with HEPES buffer and 10 mM ammonium acetate (pH 7.5). The grid was drained by 
wicking with filter paper, leaving a very thin solution layer, then immediately quick-frozen 
by plunging into liquid ethane at -150°C. The frozen specimen was transferred at liquid 
nitrogen temperature into the STEM (Vacuum Generators, Model HB601UX) and 
freeze-dried at -140° C in the STEM cold-stage. Images in a 480 x 480 pixel format were 
acquired with the specimen at -150°C using cold field emission at 100 kV, a dose of 
6 e/Å² and a pixel size of 6.5 Å. The beam size was 3 Å. Inelastic and annular dark field 
signals were detected simultaneously. 
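
The imaging parameters above imply a field of view and an electron count per pixel that can be checked with simple arithmetic. The following Python sketch is illustrative only; it assumes that the quoted dose is delivered uniformly over the area of each square pixel, which the text does not state, and the function names are hypothetical.

```python
ANGSTROM_PER_NM = 10.0


def field_of_view_nm(pixels, pixel_size_angstrom):
    """Edge length of the imaged field in nanometres."""
    return pixels * pixel_size_angstrom / ANGSTROM_PER_NM


def electrons_per_pixel(dose_e_per_A2, pixel_size_angstrom):
    """Electrons per pixel, assuming the dose is uniform over a square pixel."""
    return dose_e_per_A2 * pixel_size_angstrom ** 2


print(field_of_view_nm(480, 6.5))     # 312.0 nm along each edge
print(electrons_per_pixel(6.0, 6.5))  # 253.5 electrons per pixel
```

Under these assumptions each 480 x 480 image covers a square of about 312 nm on a side.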

Nanogold Marking 

The quaternary structure of IR bound to insulin was determined by marking with 
Nanogold. The 70 atom gold marker localized and delimited the insulin binding site. 
Compared to native bovine insulin, Nanogold-bovine-insulin (NG-BI), derivatized at the 
B-chain Phe 1 (5), a location not directly involved in receptor binding (6), bound to human 
insulin receptor (HIR) with only a slightly reduced affinity (FIG. 1). Purified solubilized 
HIR used in this study has been shown to be fully active (7). Such HIR, incubated with 
NG-BI to form the HIR/NG-BI complex in the absence of ATP, was subjected to low-dose 
dark field STEM imaging at -150° C. FIG. 2A shows a representative field of individual 
molecules. On average, each HIR/NG-BI complex measured 15 nm across. Based on its 
strong scattering, the 1.4 nm gold ligand of NG-BI was located on the image directly as a 
clear site of highest density, or could be demonstrated as such by thresholding. FIG. 2B 
shows examples of molecules with 1 or 2 sites of highest density, indicative of binding of 
one or occasionally two NG-BI particles, consistent with the known binding of between one 
and two insulins per IR (3). When two NG-BI particles were detected, they were in close 
proximity to each other. 

Image Reconstruction 

Approximately 700 images were selected for reconstruction on the basis of having a 
definite site of high density, the expected mass for the complex, being structurally 
contiguous, and being separated from neighbouring images. The 3D reconstructions of the 
HIR/NG-BI complex are shown in FIG. 3. The interpreted alignment and the fit of the 
biochemical domains to this structure are detailed in FIG. 4. The 3D structure at the full 
expected volume is compact and globular (FIG. 3A, top panel). The NG-BI particle was 
located on the 3D reconstruction by increasing the density threshold without imposing 
symmetry (FIG. 3A, panel 2 and 3), to pinpoint the binding site and to limit the fit of insulin 
to its vicinity within the IR complex. Since insulin binds to the L1-Cys-rich-L2 regions of 
the ectodomain of IR, the NG cluster identifies this region of IR in the reconstruction. 

Paired elastic and inelastic images were combined to increase the signal-to-noise 
ratio two-fold. Single particles were interactively selected in 64x64 pixel windows using 
the program WEB (Wadsworth Laboratories, Albany NY), and low-pass filtered to 1.0 nm 
using a Gaussian filter in the program SPIDER (Wadsworth Laboratories, Albany NY). The 
molecular mass was calculated relative to the 23 Å carbon support with a density of 2.0 
g/cm³. The particles had a Gaussian mass distribution with a modal mass of 570 kDa, 
which includes the mass of 480 kDa for the HIR and NG-BI plus the weight for an 
estimated 150 Triton X-100 molecules. Particle images were "grown" from a central high 
density in expanding contiguous contour levels to a global cut-off corresponding to the 
average mass. Relative orientations were computed as before (N. A. Farrow and F. P. 
Ottensmeyer, J. Opt. Soc. Am. A 9, 1749 (1992); N. A. Farrow and F. P. Ottensmeyer, 
Ultramicroscopy 52, 141 (1993); G. J. Czarnota, D. W. Andrews, N. A. Farrow, F. P. 
Ottensmeyer, J. Structural Biology 113, 35 (1994); G. J. Czarnota, D. P. Bazett-Jones, E. 
Mendez, V. G. Allfrey, F. P. Ottensmeyer, Micron 28, 419 (1997)) and 3D reconstructions 
were performed by filtered back-projection using an angular distribution-dependent filter. 
Measurements of resolution were obtained via Fourier shell phase residual calculations 
between reconstructions of two independent sets of half of the 704 images each (G. J. 
Czarnota, D. W. Andrews, N. A. Farrow, F. P. Ottensmeyer, J. Struct. Biol. 113, 35 (1994)). 
Calculations were carried out on an SGI Indigo workstation (Silicon Graphics Inc., 
Mountain View, CA). The program IRIS EXPLORER 2.0 (SGI, Mountain View, CA) 
displayed the 3D reconstructions. To show domain relationships and structural links, the 
reconstructions were displayed with intermediate densities between 5% and 10% higher 
than the average density for the full volume. INSIGHT II (Molecular Simulations Inc., San 
Diego, CA) was used to dock known crystal structures and approximate models. 
Handedness of the construct was determined by fitting the x-ray crystallographic structure 
of tyrosine kinase domain into mirror pairs of the 3D reconstruction. 
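
The mass figures quoted above can be cross-checked with a short calculation. The Python sketch below is illustrative only: it converts the stated carbon film thickness and density into a mass per unit area, and derives the mass per detergent molecule implied by the 570 kDa modal mass, the 480 kDa complex and the estimated 150 Triton X-100 molecules. How the mass calibration was actually carried out is not described here, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # daltons per gram (Avogadro constant)


def carbon_mass_per_area_da_per_A2(thickness_A, density_g_per_cm3):
    """Mass per unit area of the carbon film, in daltons per square angstrom."""
    thickness_cm = thickness_A * 1e-8
    grams_per_cm2 = thickness_cm * density_g_per_cm3
    return grams_per_cm2 * AVOGADRO / 1e16  # 1 cm^2 = 1e16 square angstroms


def implied_detergent_mass_kda(modal_kda, complex_kda, n_detergent):
    """Mass per detergent molecule implied by the modal and expected masses."""
    return (modal_kda - complex_kda) / n_detergent


print(round(carbon_mass_per_area_da_per_A2(23.0, 2.0), 1))  # 27.7
print(implied_detergent_mass_kda(570.0, 480.0, 150))         # 0.6
```

The implied value of about 0.6 kDa per detergent molecule is close to the commonly quoted average molecular weight of Triton X-100 (roughly 0.6 to 0.65 kDa, an outside figure not given in the text), so the three quoted numbers are mutually consistent.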

Example 2 - Structural Characteristics of IR 

Domain-like features of the structure become evident at intermediate density 
thresholds (FIG. 3A, panel 2), and, except for the NG-BI region, these indicate a strong 
2-fold vertical rotational symmetry as anticipated from the dimeric configuration of the 
oligotetrameric (αβ)2 structure of IR. This symmetry was used to reduce noise in the 
reconstructions and render the structures shown in panel 1 and in FIG. 3B, as being viewed 
in the plane of the membrane, and in the extracellular (top) and intracellular (bottom) 
perspectives. Views of these structures are reminiscent of the X- and Y-shaped electron 
microscopic images previously observed for IR or its ectodomain. 
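
Using a two-fold symmetry to reduce noise amounts to averaging the density map with a copy of itself rotated by 180 degrees about the symmetry axis. The following minimal Python sketch shows that operation on a small voxel grid. It is not the reconstruction software used in this example; the nested-list representation, the choice of z as the vertical axis and the assumption that the axis passes through the centre of a square x-y grid are all illustrative.

```python
def twofold_average(volume):
    """Average a density map with its copy rotated 180 degrees about z.

    `volume[i][j][k]` is a nested list indexed as x, y, z. The grid is
    assumed to be square in x and y, with the rotation axis through the
    centre of the x-y plane, so voxel (i, j, k) pairs with
    (n - 1 - i, n - 1 - j, k).
    """
    n = len(volume)
    return [
        [
            [
                0.5 * (volume[i][j][k] + volume[n - 1 - i][n - 1 - j][k])
                for k in range(len(volume[i][j]))
            ]
            for j in range(n)
        ]
        for i in range(n)
    ]


# Tiny 2 x 2 x 1 example with made-up densities.
noisy = [[[1.0], [2.0]], [[3.0], [5.0]]]
print(twofold_average(noisy))  # [[[3.0], [2.5]], [[2.5], [3.0]]]
```

If the noise in symmetry-related voxels is independent, averaging the two copies lowers its standard deviation by about a factor of the square root of two, while any genuinely asymmetric feature (such as a single NG-BI label) is blurred across both positions.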

In the side views, the top part of the structure, where NG is located, is identified as 
the ectodomain of the α subunit. The dog-bone-shaped substructure of the 3D 
reconstruction (FIG. 3B, top view), and equivalently the top-most, bow-tie-shaped structure 
(FIG. 3B, 0°), are designated as the two L1 domains of the dimeric receptor on the basis of 
the x-ray structure of the L1-Cys-rich-L2 domains. The side view at 65° shows the 
L1-Cys-rich-L2 domains as contiguous substructures across the upper central region of the 
molecule, with enough additional volume in this region to account for most of the 
remaining mass of the two α subunits, primarily the connecting domains (CD). 

The contiguity of the domain structure (FIG. 3B, top and side view 90°), along with 
the primary domain sequence (FIG. 4A), shows that the two β subunits occupy the lower 
half of the structure, distal from L1, reaching up and out as a contiguous mass. The 
intracellular TK domain of IR would then occupy the bottom portion of this structure with 
two IR fibronectin type III (FnIII) repeats in each receptor half being in the top portion of 
the crescent-shaped spiral of the β subunit at the same level as the L2 domain in the α 
subunit. One of the FnIII repeats, composed of residues from both the α and β subunit, is 
assigned to the upper left end of the crescent (side view, 0°) where it is contiguous with the 
CD portion of the α subunit (top view). FIG. 4C and 4D (cf. FIG. 3B, 90°, top view, 
respectively) show the fitting of the crystal structure of the TK domain (green) of the β 
subunit and of the two FnIII repeats (blue/red) modelled as the canonical fibronectin type 
III structures (16). 

The masses of the kinase domains are connected via a slender horizontal bridge 
(FIG. 3B, side view 90°) that was not observed in the x-ray structures of the TKs, but can be 
explained in terms of the reconstruction being in a transition between free IR and its 
ligand-activated form. In the two symmetrically fitted TK (FIG. 4C and 4D) crystal 
structures the catalytic loops are separated by 4 nm. This distance is just sufficient to permit 
the tyrosine triplet (Tyr 1158, 1162 and 1163) in a fully extended flexible activation loop of 
one TK to reach the catalytic loop of the opposite TK as modelled from the x-ray 
coordinates (PDB HRp). The extension of the activation loops, equivalent in cross-section 
to four extended polypeptide chains, easily accounts for the linking density observed 
between the lower portions of the β subunits (FIG. 3B, 90°). This is an important difference 
from the x-ray structures of the inactive and activated TKs as discussed below. 

The spatial relationship between the domains of the α and β subunits (e.g. side 
view, 90°) shows the location of the cell membrane lipid bilayer as the space below the α 
subunits and above the bridge linking the two assigned TK domains. Instead of a flat open 
region, this space in the 3D reconstruction forms a thick dome-like slab above the bridge 
with a thickness variation of 2.2 to 2.7 nm. This spacing is a change in shape from, and a 
decrease in the thickness expected for, a membrane bilayer that would accommodate an 
alpha-helical transmembrane domain (TM) of 23-26 hydrophobic amino acids. However, 
since the purified IR in the absence of its native membrane was fully active, the relative 
positions of the extracellular and intracellular domains must still represent a close to native 
arrangement. 
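
The statement that the observed 2.2 to 2.7 nm spacing is thinner than expected for a 23-26 residue transmembrane helix can be checked with a one-line calculation. The Python sketch below assumes the textbook axial rise of about 0.15 nm per residue for an ideal alpha helix, a value that is not given in the text, and it ignores any tilt of the helix relative to the membrane normal.

```python
HELIX_RISE_NM = 0.15  # assumed axial rise per residue of an ideal alpha helix


def helix_length_nm(n_residues, rise_nm=HELIX_RISE_NM):
    """Length along the helix axis spanned by n_residues."""
    return n_residues * rise_nm


for n in (23, 26):
    print(n, round(helix_length_nm(n), 2))  # 23 -> 3.45 nm, 26 -> 3.9 nm
```

On this estimate an untilted helix of 23-26 residues would span roughly 3.5 to 3.9 nm, about 1 nm more than the 2.2 to 2.7 nm slab seen in the reconstruction.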

The crossing L1-Cys-rich-L2 domains of the dimeric α subunits were presented 
(FIG. 4B and 4C). We determined the x-ray coordinates (5) (See FIG. 7). Nonetheless, 
using this structure, the localization of the gold cluster, and the known receptor-binding 
domain of insulin (8), we have tentatively fitted an NG-B1 molecule into this region. The 
best fit is obtained with a molecule of insulin, partially on the two-fold symmetry axis of 
the dimer, being in contact with the L1-Cys-rich domains of one α subunit and with the 
L2 domain of the other α subunit. A model involving both α subunits in the high-affinity 
binding of insulin has previously been proposed based on studies of insulin analogues 
binding to IR and IR/IGF-I R chimeras (8). Our 3D reconstruction shows this 
involvement. Although two molecules of insulin can be fitted to this configuration, two 
molecules of Nanogold-labeled insulin were observed only rarely in the STEM images. 
The high-affinity binding of the first insulin molecule to the IR has induced a 
conformational change in the binding domain so that the second insulin molecule would 
bind only at low affinity. Likewise the binding of a second molecule of insulin could 
effect a conformational change that enhances the dissociation of the bound insulin. Thus 
the curvilinear Scatchard plot and the negative cooperativity of insulin binding (9) can be 
explained on the basis of the 3D reconstruction. The reconstruction also explains why 
only low-affinity binding is obtained with purified αβ monomer. 
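
To illustrate what a curvilinear Scatchard plot looks like numerically, the following sketch compares a single class of binding sites with a mixture of one high-affinity and one low-affinity site. Two independent site classes are used here only as a simple stand-in that produces the same curved shape; this is not the cooperative mechanism proposed in the text, and all capacities and dissociation constants are hypothetical values in arbitrary units.

```python
def bound(free, sites):
    """Total bound ligand for independent site classes given as (capacity, Kd)."""
    return sum(n * free / (kd + free) for n, kd in sites)


def scatchard_points(free_concs, sites):
    """Return (bound, bound/free) pairs, the two axes of a Scatchard plot."""
    return [(bound(f, sites), bound(f, sites) / f) for f in free_concs]


def segment_slopes(points):
    """Slope of each straight segment joining consecutive Scatchard points."""
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(points, points[1:])]


FREE = [0.01, 0.1, 1.0, 10.0, 100.0]      # hypothetical free ligand levels
ONE_SITE = [(2.0, 1.0)]                   # hypothetical: one site class
TWO_SITES = [(1.0, 0.1), (1.0, 10.0)]     # hypothetical: high + low affinity

one_site_slopes = segment_slopes(scatchard_points(FREE, ONE_SITE))
two_site_slopes = segment_slopes(scatchard_points(FREE, TWO_SITES))

print(one_site_slopes)   # constant slope: a straight line
print(two_site_slopes)   # slope flattens as binding increases: a curve
```

A single site class gives a straight line of slope -1/Kd, whereas the mixed case starts steep and flattens, which is the curved shape referred to above.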

Superimposition of known crystal structures of smaller domains of the receptor on 
substructures of the 3D reconstruction has made it possible to deduce the spatial 
relationship among the domains in the complex. The structure shows the division of the 
complex into the extracellular and the cytoplasmic segments along a plane, the cell 
membrane, on which the fibronectin type III repeats lie (16-18). These repeats appear 
pontoon-like to support the centrally located insulin-binding segment of the ectodomain. 

Monomeric inactive receptor TKs such as EGFR are brought together by ligand 
binding and become activated as dimers resulting in TK autophosphorylation. In the 
intrinsically dimeric IR, the distance between the two β 
subunit TKs within the dimer must be too great without ligand binding for the activation of 
the kinase. Hubbard et al. (4) suggested that insulin binding to IR decreased this distance 
by disengaging Tyr1162 from the catalytic loop to enable trans phosphorylation in the 
presence of ATP. In our reconstruction a good fit to the ligand-receptor complex is 
obtained when the two TK domains are oriented [...]. In 
this orientation the extended flexible activation loop of each TK, which moves 30 Å 
between the inactive and activated states in the crystal structures (4), can just reach the 
catalytic loop of the opposing TK to be activated. These two loops can easily form the 
linking mass density between the TKs seen in the 3D reconstruction in the absence of ATP. 

The 3D structure obtained from images of the HIR-complex containing only a single 
NG-B1 shows that one molecule of insulin is sufficient to bring the two αβ monomers to an 
activating configuration. The dimeric receptor with a Ser323Leu mutation in the L2 domain 
of both α subunits showed a severe impairment in insulin binding, whereas a hybrid 
receptor with only one of the two α subunits mutated was found to bind insulin with high 
affinity and was fully active as a tyrosine kinase. Based on our 3D reconstruction, insulin 
bound to the L1 domain of the mutant α subunit and the wild-type L2 domain of the hybrid 
IR, and the binding of only a single molecule of insulin is sufficient for TK activation. 

Thus we have obtained the 3D quaternary structure of the IR-insulin complex 
formed in the absence of ATP. The structure was an intermediate between insulin-free IR 
and the fully activated, phosphorylated IR. The reconstruction is readily interpreted as such: 
as a receptor poised for activation by transphosphorylation. We determine the full extent 
of conformational changes induced by insulin binding. We reconstruct the initial state of 
free IR and the final activated state for comparison. The 3D reconstruction presented here 
provides concrete structural information towards the full understanding of transmembrane 
signal transmission in insulin action. Furthermore, the approach used in this study can be 
applied to obtain the quaternary structure of other membrane proteins or receptors that are 
refractory to crystallization. The invention includes the methods for studying polypeptide 
structure described in this application. 

Example 3 - Mechanics of Transmembrane Signalling of the Insulin Receptor 

The binding of insulin to the extracellular domain of the insulin receptor (IR) begins 
an intracellular signal cascade that ends in numerous insulin-specific cellular responses. The 
binding event activates the intracellular tyrosine kinase (TK) domain of the receptor. How 
the signal is transmitted across the cell membrane has remained a mechanistic puzzle, since 
complete membrane receptors have been refractory to high resolution structural studies by 
NMR spectroscopy or by crystallography. In an alternative approach we have used low-dose 
low-temperature dark field scanning transmission electron microscopy (STEM) to 
determine the three-dimensional quaternary structure of the entire isolated 480 kDa human 
insulin receptor bound to insulin 1. Recently the atomic co-ordinates of individual 
N-terminal domains of the extracellular region of a highly homologous receptor, the 
insulin-like growth factor type 1 receptor (IGF-1R), have become available, as have models of the 
three individual fibronectin type III (Fn) domains of IR 10,31. We have modified these domain 
structures substituting the IR amino acid sequence and accommodating the covalent dimeric 
character of IR. The IR TK domain structures were available previously 8,9. All of these 
domains were fitted into the quaternary structure calculated from STEM micrographs. The 
fit provides a detailed description of the insulin binding site of IR and of its interactions 
with insulin. Moreover, the entire 3D complex is a molecular machine with intrinsic 
linkages that provides a mechanistic model for transmembrane signal transduction by IR. 
Since IR is constitutively dimeric 2, the mechanism of IR signal transduction is of necessity 
different from that of many receptors activated by ligand-induced dimerization. Instead, the 
binding of insulin changes [...] 
one that is openly permissive of TK transphosphorylation. 

The structure and model explain observations on insulin binding, on disulphide 
modifications linking the two monomers and linking to constituent domains, the block to 
TK activation, dominant negative mutations, insulin-dependent and insulin-independent 
autophosphorylation, and transmembrane modifications. Moreover, the model is 
sufficiently general to serve as an archetype for dimeric two-state receptors like IR that are 
activated or inhibited by ligand binding. 

The 3D structure determined at 20 Å by reconstruction from electron micrographs 
of sets of single insulin-bound IR complexes 1 is shown in FIG. 5, with views as seen from 
the exterior of the cell membrane (FIG. 5a(i)), the interior of the cell (FIG. 5a(iii)), and at 
90° from these in the plane of the membrane (FIG. 5a(ii)). Antibody labelling has recently 
confirmed the location of three pairs of the assigned ectodomain regions 3. 

Covalent linking of the two monomers of IR occurs between Cys524 of each 
monomer, and also between corresponding Cys682 (or 683 or 685) moieties 4-7. Each 
monomer itself contains a 135 kDa α subunit and a 95 kDa β subunit linked by a single 
disulphide bond (αCys647 to βCys872). The structure of one monomer is diagrammed in 
FIG. 6. From considerations of symmetry of the (αβ)2 dimer, the two α-α disulphide 
bonds 5-7 occur one above the other on the two-fold symmetry axis of the dimer (labelled 1 
and 2, FIG. 6). In the interpretation of the 3D structure, two polypeptide chains link the β 
subunit from fibronectin domain Fn1 to the connecting domain CD/Fn0 and insert domain 
ID of the central α subunit. 

Crystal structures were determined only for parts of IR: the intracellular TK domain 
in the unphosphorylated state as well as phosphorylated and bound to a peptide substrate 8,9, 
and the first three extracellular domains, L1, Cys-rich, and L2, of the homologous type 1 
insulin-like growth factor receptor (IGF-1R) 10. From analysis of sequence homology each 
αβ monomer contains three fibronectin [...] 
transmembrane and juxtamembrane regions and the ID and C-terminal domains of the β 
subunit are still of unknown structure. 



Example 4 - Docking of L1-CR-L2 

The atomic co-ordinates of the L1-CR-L2 regions of IGF-1R (PDB: 1IGR) were 
used to substitute and insert corresponding amino acids for IR into the IGF-1R structure. 
Additional loops that do not exist in IGF-1R, e.g. amino acids 272-275, were inserted where 
necessary. This was followed by several rounds of molecular dynamic calculations using 
the program InsightII (Molecular Simulations, San Diego, CA) to eliminate atomic clashes 
and to approach a corresponding energy minimum for the IR sequence. No rotations of the 
L1, CR, or L2 domains relative to each other were carried out during any of the procedures. 

Two IR-based L1-CR-L2 structures, one for each IR monomer, were then docked 
symmetrically into the central ectodomain of the quaternary IR dimer structure according to 
the domain sequence scheme proposed previously 1. Several other symmetric configurations 
were tested as well, such as reversing the positions for L1 and L2 or rotating the L1-CR-L2 
structure to extend L2 into the regions designated for the CD/Fn0 domains. The final fit 
maximized overlap of the EM-based mass with the atomic structure, while avoiding overlap 
of the atoms of the two L1-CR-L2 cross-over regions (FIG. 7a). Moreover, this 
configuration resulted in an additional fit of loops in the L1 regions to slender masses 
extending from the corresponding regions of the EM structure (FIG. 7b) and provided 
atomic confirmation for the cam-like structures on the CR regions (FIG. 7c). These cam- 
like structures are formed by a loop of amino acids from 250 to 280 that is stabilized by a 
disulphide bond between Cys266 and Cys274 32. 
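
The fitting criterion described above (maximize overlap with the EM mass while avoiding overlap between the two docked copies) can be expressed as a simple score. The following is only a minimal sketch under stated assumptions: the density is a hypothetical dictionary of voxel indices to values, and the voxel size, density threshold, clash distance and penalty weight are illustrative parameters that do not come from this application.

```python
import math


def fit_score(atoms_a, atoms_b, density, voxel=1.0, threshold=0.5,
              clash_dist=2.0, clash_penalty=1.0):
    """Score a symmetric docking of two atom lists into an EM density map.

    atoms_a, atoms_b: lists of (x, y, z) coordinates for the two copies.
    density: dict mapping integer voxel indices (i, j, k) to density values.
    All parameter defaults are hypothetical illustration values.
    """
    def inside(atom):
        # Map the atom to its voxel and test the density there.
        key = tuple(int(math.floor(c / voxel)) for c in atom)
        return density.get(key, 0.0) >= threshold

    # Reward: number of atoms lying inside the EM-based mass.
    overlap = sum(1 for atom in atoms_a + atoms_b if inside(atom))
    # Penalty: atom pairs from the two copies that come too close.
    clashes = sum(1 for a in atoms_a for b in atoms_b
                  if math.dist(a, b) < clash_dist)
    return overlap - clash_penalty * clashes


# Toy density with two occupied voxels (hypothetical).
toy_density = {(0, 0, 0): 1.0, (5, 0, 0): 1.0}
good = fit_score([(0.5, 0.5, 0.5)], [(5.5, 0.5, 0.5)], toy_density)
clashing = fit_score([(0.5, 0.5, 0.5)], [(0.6, 0.5, 0.5)], toy_density)
print(good, clashing)
```

Candidate configurations, such as the swapped L1/L2 placement mentioned above, could be compared by evaluating such a score for each and keeping the highest.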

Example 5 - Insulin Binding Region 

The fit of the two L1-CR-L2 regions formed a diamond-shaped central tunnel 
(FIG. 7a). Each CR domain and the juxtaposing L2 surface of the opposite monomer 
formed one side of the diamond, proximal to the membrane. The other two sides were 
formed, one each, by the L2-facing surface of L1 10. This arrangement lined the tunnel with 
almost all of the amino acids that are linked to the binding of insulin. The atomic structure 
of human insulin (PDB: 1BEN) fitted into this tunnel as shown in the stereo view in FIG. 8a, 
involving binding sites on both monomers. Insulin interaction with one monomer involved 
major hydrophobic areas on the insulin B chain (ValB12, TyrB16, LeuB17, and PheB24 to 
TyrB26) and on L1 (Leu87 to Phe89, and Tyr91), as well as interactions between GluB21 
on insulin and His247 and Gln249 of the CR region (FIG. 8b). Interaction with the other 
monomer was predominantly electrostatic with no obvious hydrophobic components 
(FIG. 8c). These interactions and others are given in Table 1, as are some of the distances 
between interacting side chains. 
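
Distances between interacting side chains of the kind listed in Table 1 are ordinary Euclidean distances between atomic coordinates. The sketch below shows the calculation; the residue names are taken from the text, but the atom names, the coordinates and the 4 Å contact cutoff are hypothetical values chosen only for illustration and are not taken from Table 1.

```python
import math

# Hypothetical coordinates in angstroms, for illustration only.
ATOMS = {
    ("insulin", "GluB21", "OE1"): (10.0, 4.0, 2.0),
    ("IR", "His247", "NE2"): (12.0, 5.0, 4.0),
    ("IR", "Phe89", "CZ"): (20.0, 9.0, 1.0),
}


def distance(atom_a, atom_b):
    """Euclidean distance in angstroms between two atoms in ATOMS."""
    return math.dist(ATOMS[atom_a], ATOMS[atom_b])


def contacts(cutoff=4.0):
    """List (insulin atom, IR atom, distance) for pairs within the cutoff."""
    found = []
    for a in ATOMS:
        for b in ATOMS:
            if a[0] == "insulin" and b[0] == "IR":
                d = distance(a, b)
                if d <= cutoff:
                    found.append((a, b, d))
    return found


for ligand_atom, receptor_atom, d in contacts():
    print(ligand_atom[1], receptor_atom[1], round(d, 2))
```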

One overriding constraint on the docking of insulin was the need to satisfy the 
location of the Nanogold label attached to PheB1 of insulin for electron microscopy 1. This 
requirement was easily satisfied by flexing the insulin B chain between amino acids 1 to 6, a 
motion that appears to occur naturally, as judged by the position of the B chain in different 
crystal structures of the molecule 34. The fit indicated that the gold marker location was 
closest to L1 of the monomer interacting electrostatically with insulin (Figs. 8a and 8c). 

Example 6 - Fibronectin Linkers 

The linkage in the ectodomain between the L1-CR-L2 regions and the IR 
transmembrane domain is via three fibronectin type III (Fn) domains and two so-called 
insert domains, one each on the α and β subunits of each monomer. This region also 
provides the two disulphide bonds that covalently link the αβ monomers to form the 
constitutive IR dimer. One disulphide bond occurs between the Fn0 domains of the α 
subunits, the other between corresponding α insert domains (FIG. 6). Two of the Fn 
domains, Fn1 and Fn2, are not involved in dimer formation, and have been modelled into 
the 3D reconstruction previously as the normal seven-beta-strand fibronectin type III 
structure 1, even though Fn1 is made up of four beta strands from the α subunits and three 
from the β subunit. 

In relation to our quaternary IR dimer structure, the α insert domain is modelled to 
lead out of the Fn1 domain across to the CD/Fn0 region, and then to lie against the near side 
of the L2 domain until it reaches the diad axis of the dimer. Here it forms a disulphide bond 
with its symmetric partner insert domain. The location of the remaining 34 amino acids of 
this domain is unknown, although the final 12 residues appear to assist in insulin binding 2. 
This shows that the peptide chain either remains near the central region or returns to the 
centrally located binding site. 

The structure of the most N-terminal Fn domain, Fn0, designated CD in prior 
descriptions 11,31, is more problematical. The domain sequence of the quaternary structure 
shows that Fn0 is located at the extreme ends of the central region of the IR ectodomain 1. 
The same conclusion is reached from the location and accessibility of monoclonal 
antibodies and Fab fragments against this region 3,33. At the same time, the location of the 
α-α disulphide bond at Cys524 within this region requires that this domain extend to the diad 
symmetry axis of the IR dimer. To accommodate both requirements, the Fn0 domains were 
placed at the ends of the central ectodomain. However, a hairpin structure, containing the 
Cys524 loop and two neighbouring beta strands of the seven-stranded Fn configuration, was 
unfolded from the Fn beta sandwich and laid against the contiguous L2 domain on the 
side opposite the insert domain loop placement above. This manoeuvre permitted the 
Cys524 residue to reach the diad axis and form the second α-α disulphide bond. In addition, 
the Fn-like configuration of this domain still easily accommodated the internal linkage to the 
C-terminal of L2, provided an exposed location of the monoclonal epitope between residues 
535 and 548 3,33, and retained the normal location of the Fn0 C-terminal, suitably positioned 
for the flexible linkage leading into Fn1 (FIG. 6). Moreover, the additional size of this Fn 
region (122 amino acids versus 106 and 97 for Fn1 and Fn2, respectively) provided enough 
mass to accommodate the volume of this region in the EM reconstruction. 

Example 7 - Physical model for transmembrane signalling 

In contrast to activation of monomer membrane receptors, activation of the IR 
tyrosine kinase cannot be caused by ligand-induced dimerization, since IR is intrinsically 
dimeric. However, the articulated structural features of the IR dimer indicate an obvious 
mechanical arrangement that permits transmembrane signalling and intracellular 
recognition both of the absence of insulin on the receptor and of insulin binding to it. 

FIG. 5A shows that the central, extracellular region of the two sets of contiguous 
domains from L1 to Fn0 is flanked on both sides by the pontoon-like Fn1/Fn2 domains, 
which are tethered asymmetrically only between Fn1 and Fn0. The two Fn2 ends, which 
terminate at the juxtamembrane and transmembrane (TM) domains, are held away from 
the central regions by the bumper-like cam structures of the two symmetry-related CR 
domains. The intracellular TK domains are then influenced by the TM and juxtamembrane 
domains to which they are attached. 

Nuclear magnetic resonance studies have shown that helical TM domains, similar to 
the IR TM, cannot transmit a signal longitudinally along their lengths 37. At most a torsional 
force can be exerted by them. However, they can shift laterally within the membrane. This 
provides a simple and direct means for transmembrane signalling for IR. 

The structural basis for the proposed mechanism of IR transmembrane signal 
transduction is depicted in FIG. 9, pared to a two-dimensional representation. In the inactive 
state (FIG. 9A) the β subunit transmembrane regions and the associated intracellular TKs 
are held apart by the cam-like blocks on the central portion of the dimeric α ectodomain. 
The open extracellular structure of the IR dimer shows that the two sets of L1-CR regions 
are splayed apart. When a single insulin molecule with its two different binding regions 15 
attaches to a contralateral pair of the four binding sites of the two α subunits 16, the bumper- 
like cam regions are rotated and lifted out of the way of the extracellular domains of the β 
subunits (FIG. 9B). The closed structure is based on the 3D reconstruction 1. 

A more realistic depiction of the contiguous three-dimensional structural features of 
the IR dimer (FIG. 5A), that alternately permit and prevent TK activation, is the set of 
connected cylinders in Figs. 5B and 5C. The perspective of FIGS. 5B(ii) and 5C(ii) is similar 
to FIG. 9. The insulin-binding domains, L1 and Cys-rich (CR), of each monomer (one 
blue, one fuchsia), cross symmetrically near the middle of the structure. They are attached 
to the L2, CD/Fn0 and ID domains, modelled as contiguous central barrel structures joined 
together on the two-fold symmetry axis via the two inter-monomer disulphides (labelled 1, 2 
in Figs. 5B and 5C). The cam-like protrusions on the CR domains, represented as discs, 
abut the Fn2 domains of the β subunits. These protrusions can just be seen in the high- 
density representation of the 3D reconstruction (cam, FIG. 5A). The mass of the cam 
reaches across from the centre to the Fn2 region in the full-volume representation (FIG. 
7B). Near the CD/Fn0 ends of the barrels, each α subunit structure extends sideways to help 
form the Fn1 repeat and to tether each β subunit by a flexible joint to the central structure. 

The N-terminal domain of the β subunit starts near the CD/Fn0 side arm of the α 
subunit (FIG. 6), leading into Fn1 and Fn2 of the extracellular domain of IR 
(Figs. 5B and 5C). At that point the β subunit forms an axle-like transmembrane (TM) 
region 4, crossing the membrane before folding into the TK domain. Flexible activation 
loops (A) of both TKs 8,9 are modelled as extending towards the catalytic region of the 
opposite TK (FIG. 5C (iii)). 

The insulin ligand, depicted as a red disk, binds slightly asymmetrically with respect 
to the two-fold axis between the two αβ monomers 1, representative of the high affinity 
binding position (FIG. 5B). It is shown attached to only one monomer (blue) at the 
inception of binding to the open, insulin-free IR dimer (FIG. 5C). 

Example 8 - Mechanism 

In the inhibitory, insulin-free state (FIG. 5c), a minimum separation is maintained 
between the two intracellular TKs, in spite of thermal motion, by the α-ectodomain CR cam 
regions (black and brown discs) that contact the β-ectodomains at the Fn2/TM domains. 
Consequently, the distance between the intracellularly attached TKs prevents the flexible 
TK activation loop of one TK from reaching the catalytic transphosphorylation site of the 
other TK 8,9 (Figs. 5c(ii and iii), "A" arrow). 

High affinity binding of a single insulin molecule joins the two L1-CR-L2 domains 
of the ectodomain (FIG. 5b) against a small torsional resistance offered by the two on-axis 
disulphide bonds (cf. FIG. 5b(ii) and FIG. 5c(ii)). This action rotates and lifts the cam 
protrusions, such that thermal motion can bring the pair of Fn2/TM-axle regions closer to 
the central barrel of the ectodomain. The reduction in separation between the TM axles 
permits a sufficiently close approach of the associated TK domains to allow 
transphosphorylation of the activation loop at the catalytic locus of the opposite TK 
(FIG. 5b(ii and iii)). 

When insulin detaches from the receptor, [...] 
again, as the two strained Cys-Cys linkages return to their equilibrium positions 
(1 and 2, FIG. 5c(ii)). At the same time the CR-region cams again restrict the approach of 
the TK domains (FIG. 5c(ii and iii)), increasing their separation, possibly to facilitate 
downstream signalling actions. 

Example 9 - The model 

The detailed model of insulin binding, [...] 
structures into the quaternary structure of the IR dimer, and the proposed mechanism for 
transmembrane signal transduction explain many observations on the behaviour of IR. A 
few examples are detailed here. 

The Insulin Binding Site 

The symmetric juxtaposition of the IR-adapted L1-CR-L2 domains in the structure 
concentrated virtually all of the known binding interactions to insulin into a tunnel-like 
space that readily accommodated the insulin ligand. Both hydrophobic and ionic interactions 
are accommodated involving L1, L2 and the CR region. A number of insulin interactions 
change in character as either insulin or IR is modified. These now have structural 
explanations. Experimentally, the interaction of insulin with the CR loop from 243 to 251 
had indicated a strengthening of binding with the introduction of positively charged 
amino acids into this region 16. The fitting of insulin into the model binding site indicates an 
interaction of GluB21 of insulin with His247 and possibly Asn249 in the CR loop. The 
presence of the negatively charged Asp250 in this vicinity weakens this interaction. Thus 
the addition of a positive charge in the 243/251 loop would clearly enhance the binding of 
insulin by providing a potential salt bridge to the GluB21 residue, while the substitution 
His247Asp permits a new ionic interaction with ArgB22. 

Experimentally, a mutation in Phe89 of the L1 domain reduces insulin binding 30. As 
indicated in Table 1, Phe89 forms part of a hydrophobic region in the insulin binding 
tunnel that is juxtaposed to a hydrophobic surface on insulin. Any decrease in this 
hydrophobic region would be expected to decrease the strength of insulin binding. 

A mutation of HisB10 in insulin to AspB10 creates a superactive insulin 35. In the fit 
to the model HisB10 interacts with Arg14 of L1. A stronger ionic interaction would be 
expected to result with the introduction of aspartate in insulin at position B10. 

Modification of IR on Insulin Binding 

High affinity binding of insulin is initially augmented, then diminished, by 
reduction of the disulphides of IR with increasing concentrations of dithiothreitol (DTT) 17. 
In the model, normal high affinity insulin binding must overcome an energy barrier created 
by the binding-induced elastic strain in the two α-α disulphide bonds on the diad axis of the 
IR dimer, due to rotation of the two L1-CR-L2 regions to the closed position. Reduction of 
one of the two disulphide bonds eliminates this torsional strain, removing the energy 
barrier, and facilitating high affinity binding. Further reduction separates IR into monomers, 
abrogating high affinity binding, which involves two α subunits in close proximity 17. A 
similar effect would be expected for a deletion that includes one of the α-α disulphide 
bonds 18. 

Autophosphorylation 

Basal insulin-independent autophosphorylation of IR occurs naturally at a low level. 
In the model the low levels of autophosphorylation reflect the torsional resistance of the two 
on-axis disulphide bonds which control the position of blocking cams in the insulin-free 
equilibrium position (FIG. 5c). However, random thermally induced motion is occasionally 
sufficient to rotate the blocking CR cams momentarily to the permissive positions. If 
random motion simultaneously brings the TM regions with their associated TK domains 
close enough together, then a round of transphosphorylation can occur even in the absence 
of insulin. Experimentally, such autophosphorylation is stimulated by mild reduction with 
DTT, then drops off to zero at higher DTT concentrations 17. The breakage of either of the 
disulphide bonds would remove the resistance to random rotation to the permissive 
position, resulting in a more frequent random approach of the TK domains for 
transphosphorylation. The reduction of both bonds would result in monomeric IR, halting 
transphosphorylation altogether. 
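
The qualitative predictions of the model for progressive disulphide reduction can be summarized as a lookup on the number of intact inter-monomer α-α disulphide bonds. The sketch below merely restates the outcomes described in the text; the function name and the labels used for each outcome are illustrative choices, not terminology from this application.

```python
def predicted_state(intact_disulphides: int) -> dict:
    """Qualitative model outcome for 2, 1 or 0 intact alpha-alpha disulphides."""
    if intact_disulphides == 2:
        # Native dimer: torsional resistance keeps the cams in the blocking position.
        return {"dimeric": True,
                "high_affinity_binding": "normal",
                "basal_autophosphorylation": "low"}
    if intact_disulphides == 1:
        # One bond reduced: torsional strain removed, receptor still dimeric.
        return {"dimeric": True,
                "high_affinity_binding": "facilitated",
                "basal_autophosphorylation": "elevated"}
    if intact_disulphides == 0:
        # Both bonds reduced: receptor separates into monomers.
        return {"dimeric": False,
                "high_affinity_binding": "abrogated",
                "basal_autophosphorylation": "none"}
    raise ValueError("expected 0, 1 or 2 intact disulphide bonds")


for n in (2, 1, 0):
    print(n, predicted_state(n))
```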

Deletional Activation 

The IR is activated artificially by removal of amino acids 1 to 578 through tryptic 
digestion 19. This cleavage still retains covalent links between the monomers and between 
the alpha and beta subunits. However, the insulin-binding region and the CR domains have 
been removed, along with their physical "cam structures". Thus the β domains and their 
TKs can move closer together and transphosphorylate, independent of the presence of 
insulin. A more limited deletion which removes part of L2 and most of the CD region 
activates IR and blunts the action of insulin 18. Such a deletion removes the physical support 
for the CR cam region of the partner monomer, thus partly collapsing the cam to permit 
rapprochement of the TK regions. At the same time the geometry of the insulin binding site 
in the L2 and CR region would be affected, as well as the insulin-induced change in the 
relative configuration of the entire L1-CR-L2 regions. 

Point Mutations 

More subtle alterations of IR are the mutations Phe383Val and Asp919Glu, both of 
which impair TK action 5,20,21. Phe383 is midway in the L2 domain 10, which in the model is 
straddled by the Fn0 linkage to the α-α Cys524 disulphide bond and by the CR cam region 
of the partner monomer that contacts the Fn2/TM region. The Asp919Glu mutation is at the 
C-terminal edge of the Fn2 domain of the β subunit, which in the model contacts the cam. 
Size modifications in either of these complementary extracellular contact sites may prevent 
proper mating of the intracellular TK domains. 

Other aspects of the function of IR that can be explained by the arrangement of the 
domains in the 3D structure include the negative or positive cooperativity of binding of 
insulin to native or mutant receptors 22-24, the loss of intracellular TK activity from the 
extracellular Cys647Ser mutation 2, the effect on extracellular binding of insulin by the 
intracellular TK mutant Met1153Ile 23, the predominantly passive role of the transmembrane 
region 26-28, and the relative down-stream kinase activity of monomeric and dimeric IR 29. 

As three further tests, the model predicts (a) that an antibody linking the two TK 
domains at their most distal intracellular ends to induce transphosphorylation would increase 
the high affinity binding of insulin; (b) that a helix breaking amino acid in the 
transmembrane region would affect TK activation without modifying insulin binding 
characteristics; and (c) that a genetically engineered shift of the cam bulge via judicious 
insertion/deletion mutations would invert the response to insulin such that TK activation 
would be constitutive, but abrogated in the presence of the ligand. 

Example 10 - Method of Identifying Modulators 

The three dimensional atomic structure can be readily used as a template for 
selecting potent modulators. Various computer programs and databases are available for 
the purpose. A good modulator should at least have excellent steric and electrostatic 
complementarity to the target, a fair amount of hydrophobic surface buried and sufficient 
conformational rigidity to minimize entropy loss upon binding. The approach usually 
comprises several steps. 

One must first define a region to target. The ligand binding site of IR or an IR 
cam can be used, but any place that is essential to the IR activity could become a potential 
target. Other protein targets include, but are not limited to, structural subdomains, 
epitopes, and functional domains. Since the fitted quaternary structure has been 
determined, the spatial and chemical properties of the target region are known. 
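
Defining a target region from a fitted structure can be as simple as collecting the residues that lie within a chosen radius of a point of interest. The following is a minimal sketch under stated assumptions: the residue names are taken from the binding-site description in this application, but the coordinates, the center point and the radius are hypothetical illustration values.

```python
import math

# Hypothetical residue centroids in angstroms, for illustration only.
RESIDUES = {
    "Leu87": (1.0, 0.0, 0.0),
    "Phe89": (3.0, 4.0, 0.0),
    "Tyr91": (0.0, 6.0, 0.0),
    "His247": (8.0, 0.0, 0.0),
    "Cys524": (30.0, 30.0, 30.0),
}


def residues_within(residues, center, radius):
    """Return sorted names of residues whose centroid lies within radius of center."""
    return sorted(name for name, xyz in residues.items()
                  if math.dist(xyz, center) <= radius)


# Hypothetical target region: everything within 8 angstroms of the origin.
target_region = residues_within(RESIDUES, center=(0.0, 0.0, 0.0), radius=8.0)
print(target_region)
```

Enlarging the radius corresponds to expanding the originally defined target region when further extension of a candidate compound is needed.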

A compound is then docked onto the target. Many methods can be used to 
achieve this. Computer databases of three-dimensional structures are available for 
screening millions of molecular compounds. A negative image of these compounds can 
be calculated and used to match the shape of the target cavity. The profiles of ionic, 
hydrophobic, hydrophilic, hydrogen bond donor-acceptor, and lipophilic points of these 
compounds can be calculated and used to match the shape of the target. Anyone skilled 
in the art would be able to identify many small molecules or fragments as hits. 
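
The shape-matching step described above can be illustrated with a toy grid comparison. This is only a sketch under stated assumptions, not the procedure of any particular docking program: the cavity and each compound are represented as hypothetical sets of occupied grid voxels, and compounds are ranked by the fraction of their volume that falls inside the cavity and by how much of the cavity they fill.

```python
def shape_match(cavity_voxels, compound_voxels):
    """Return (fit, fill) for one compound pose against a target cavity.

    fit:  fraction of the compound's voxels that lie inside the cavity.
    fill: fraction of the cavity's voxels occupied by the compound.
    """
    cavity = set(cavity_voxels)
    compound = set(compound_voxels)
    if not cavity or not compound:
        return 0.0, 0.0
    inside = len(cavity & compound)
    return inside / len(compound), inside / len(cavity)


def rank_hits(cavity_voxels, library):
    """Sort compound names by (fit, fill), best first."""
    scored = {name: shape_match(cavity_voxels, voxels)
              for name, voxels in library.items()}
    return sorted(scored, key=lambda name: scored[name], reverse=True)


# Hypothetical cavity of four voxels and a hypothetical three-compound library.
CAVITY = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
LIBRARY = {
    "compound_a": [(0, 0, 0), (1, 0, 0), (2, 0, 0)],   # fits, fills 3/4
    "compound_b": [(0, 0, 0), (0, 1, 0)],               # half outside the cavity
    "compound_c": [(0, 0, 0)],                          # fits, fills 1/4
}

ranking = rank_hits(CAVITY, LIBRARY)
print(ranking)
```

A fuller treatment would add terms for the ionic, hydrophobic and hydrogen-bonding profiles mentioned above; they are omitted here to keep the sketch short.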

One then utilizes linking and extending recognition fragments. Using the hits 
identified by the above procedure, one can incorporate different functional groups or 
molecules into a single, large molecule. The resulting molecule is likely to be more 
potent and have higher specificity. It is also possible to try to improve the modulator by 
adding more atoms or fragments that will interact with the target protein. The originally 
defined target region can be readily expanded to allow further necessary extension. 

A number of promising compounds can be selected through the process. They can 
then be synthesized and assayed for their agonizing or antagonizing properties. 

The present invention has been described in detail and with particular reference to 
the preferred embodiments; however, it will be understood by one having ordinary skill 
in the art that changes can be made thereto without departing from the spirit and scope 
thereof. 

All publications, patents and patent applications are herein incorporated by 
reference in their entirety to the same extent as if each individual publication, patent or 
patent application was specifically and individually indicated to be incorporated by 
reference in its entirety. 

We claim: 

1. A method of identifying a compound that modulates insulin receptor activity, 
comprising producing a compound that interacts with all or part of the fitted 
quaternary structure of insulin receptor or a fragment or derivative thereof and 
which thereby modulates insulin receptor activity. 

2. The method of claim 1, further comprising synthesizing the compound. 

3. A method of identifying a compound that modulates insulin receptor activity, 
comprising comparing the structure of a compound for modulating insulin 
receptor activity to all or part of the fitted quaternary structure of insulin receptor 
or a fragment or derivative thereof to determine whether the compound is likely to 
modulate insulin receptor activity. 

4. The method of claim 1 or 3, further comprising determining whether the 
compound modulates the activity of the insulin receptor or a fragment or a 
derivative thereof having insulin receptor activity in an in vivo or in vitro assay. 

5. The method of claim 1 or 3, wherein the compound comprises an insulin receptor 
agonist or an IR antagonist. 

6. The method of claim 1 or 3, wherein the fitted quaternary structure of insulin 
receptor comprises substantially the entire fitted quaternary structure of insulin 
receptor. 

7. The method of claim 1 or 3, further comprising: 

a) introducing into a computer program information defining a ligand 
binding site conformation including at least one residue from monomer A 
in Table 1 and at least one residue from monomer B in Table 1, the ligand 
binding site defined by the approximate amino acid distances listed in 
Table 1, wherein the program displays the quaternary structure thereof; 

b) comparing the structural coordinates of the compound to the structural 
coordinates of the ligand binding site and determining whether the 
compound fits spatially into the ligand binding site and is capable of 
changing insulin receptor from an inactive conformation to an active 
conformation or biasing insulin receptor toward an active conformation; 

wherein the ability to change insulin receptor from an inactive conformation to an 
active conformation or bias insulin receptor toward an active conformation is 
predictive of the ability of the compound to agonize insulin receptor activity. 

8. The method of claim 7, further comprising preparing the compound that fits 
spatially into the ligand binding site and determining whether the compound 
agonizes insulin receptor activity in an insulin receptor activity assay. 

9. The method of claim 1 or 3, further comprising: 

a) introducing into a computer program information defining a ligand 
binding site conformation including at least one residue from monomer A 
in Table 1 and at least one residue from monomer B in Table 1, the ligand 
binding site defined by the approximate amino acid coordinates listed in 
Table 1, wherein the program displays the quaternary structure thereof; 

b) comparing the structural coordinates of the compound to the structural 
coordinates of the ligand binding site and determining whether the 
compound fits spatially into the ligand binding site and is capable of 
changing insulin receptor from an active conformation to an inactive 
conformation or biasing insulin receptor toward an inactive conformation; 

wherein the ability to change insulin receptor from an active conformation to an 
inactive conformation or bias insulin receptor toward an inactive conformation is 
predictive of the ability of the compound to antagonize insulin receptor activity. 

10. The method of claim 9, further comprising preparing the compound that fits 
spatially into the ligand binding site and determining whether the test compound 
antagonizes insulin receptor activity in an insulin receptor activity assay. 

11. The method of claim 1 or 3, further comprising: 

a) introducing into a computer program information defining a cam including 
at least one residue from the Cam-loop segment in Table 2 and at least one 
residue from the L1 surface in Table 2, wherein the program displays the 
quaternary structure thereof; 

b) comparing the structural coordinates of the compound to the structural 
coordinates of the cam and determining whether the compound interacts 
with the cam and is capable of changing insulin receptor from an inactive 
conformation to an active conformation or biasing insulin receptor toward 
an active conformation; 

wherein the ability to change insulin receptor from an inactive conformation to an 
active conformation is predictive of the ability of the compound to agonize insulin 
receptor activity. 

12. The method of claim 11, further comprising preparing the compound that interacts 
with the cam and determining whether the test compound agonizes insulin 
receptor activity in an insulin receptor activity assay. 

13. The method of claim 1 or 3, further comprising: 

a) introducing into a computer program information defining a cam 
conformation including at least one residue from the Cam-loop segment in 
Table 2 and at least one residue from the L1 surface in Table 2, wherein 
the program displays the quaternary structure thereof; 

b) comparing the structural coordinates of the compound to the structural 
coordinates of the cam and determining whether the compound interacts 
with the cam and is capable of changing insulin receptor from an active 
conformation to an inactive conformation; 

wherein the ability to change insulin receptor from an active conformation to an 
inactive conformation or bias insulin receptor toward an inactive conformation is 
predictive of the ability of the compound to antagonize insulin receptor activity. 

14. The method of claim 13, further comprising preparing the compound that interacts 
with the cam and determining whether the test compound antagonizes insulin 
receptor activity in an insulin receptor activity assay. 

15. The method of any of claims 1 or 3, wherein the insulin receptor is bound to 
insulin. 

16. A computer medium having recorded thereon data of an insulin receptor, said data 
sufficient to model all or part of the fitted quaternary structure of the receptor. 

17. The computer medium of claim 16, wherein the data comprises structural 
coordinates of an insulin receptor, the coordinates sufficient to model all or part of the 
quaternary structure of the receptor. 

18. The computer medium of claim 16, wherein the quaternary structure of the 
receptor comprises substantially all of the quaternary structure of the receptor. 
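
By way of illustration only, the spatial comparison recited in step b) of claims 7 and 9 might be sketched as follows. The sketch assumes that each binding site residue is reduced to a single representative coordinate and that the compound is supplied as a list of atom coordinates. The names SiteResidue and fits_site, the 4.5 angstrom contact cutoff, the 2.0 angstrom clash cutoff, and all coordinates shown are hypothetical and are not taken from the tables of the specification. The sketch tests only geometric fit and contact with both monomers; it does not predict whether a compound biases the receptor toward an active or inactive conformation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteResidue:
    monomer: str   # "A" or "B"
    name: str      # hypothetical label for the residue
    xyz: tuple     # one representative coordinate, in angstroms


def fits_site(compound_atoms, site, contact_cutoff=4.5, clash_cutoff=2.0):
    """Return True if the compound contacts residues of both monomers
    without sterically overlapping any site residue."""
    contacted = set()
    for res in site:
        # Distance from this residue to the closest compound atom.
        nearest = min(math.dist(atom, res.xyz) for atom in compound_atoms)
        if nearest < clash_cutoff:
            return False              # steric overlap: does not fit
        if nearest <= contact_cutoff:
            contacted.add(res.monomer)
    # The site must be bridged: at least one contact on each monomer.
    return {"A", "B"} <= contacted


# Toy data (hypothetical coordinates, for demonstration only).
site = [
    SiteResidue("A", "A:res1", (0.0, 0.0, 0.0)),
    SiteResidue("B", "B:res1", (8.0, 0.0, 0.0)),
]
bridging = [(3.5, 0.0, 0.0), (4.5, 0.0, 0.0)]   # reaches both monomers
one_sided = [(3.0, 0.0, 0.0)]                   # reaches monomer A only
clashing = [(1.0, 0.0, 0.0), (4.5, 0.0, 0.0)]   # overlaps residue A:res1

print(fits_site(bridging, site))
print(fits_site(one_sided, site))
print(fits_site(clashing, site))
```

A compound passing such a geometric screen would still need to be prepared and tested in an insulin receptor activity assay, as recited in claims 8 and 10.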

Abstract 

The invention includes the fitted quaternary structure of insulin receptor. It also 
includes methods of identifying compounds that modulate insulin receptor activity by 
producing a compound that interacts with all or part of the fitted quaternary structure of 
insulin receptor or a fragment or derivative thereof and which thereby modulates insulin 
receptor activity. 

Human Insulin Receptor 

Leader Sequence 

MGTGGRRGAA AAPIiLVAVAA L1XGAAG 






Alpha subunit 

HLYPGEVCPG MDIRNNLTRIi HBLE»CSVTE GHLQILLMFK TRPEDFRDLS 50 

FPKLXMXTDY IXLFHVYGLE Sl#KOLPPNI*T VIRGSRLFFK YALVIFEMVH 100 

LKELGLZUI*M NITRGSVRIE KNWELCYIAT IDWSRILDSV EDNHIVLKKD 150 

DNEECGDICP GTAKGKTNCP ATVINGQFVE RCWTHSHCQK VCPTICX5HC 200 

CTAEGliCCHS ECLGHCSQPD DPTKCVACRH FYIiDGRCVET CPPPYYHFQD 250 

WRCVUFSFCQ DLHHKCKNSR RQGCHQYVIH NNKCIPECPS GYTMNSSHLL 300 

CTPCLGPCPK VCHLI*EGEKT IDSVTSAQEL RGCTVISGSL IINIRGGNNL 350 

AAELEAHI*GL IEBISGYI*KI RRSYALVSLS FFRKLRLIRG ETLEIGNYSF 400 

YALDNQHLRQ LWDWSKHNLT TTQGKLFPHY NFKLCLSEIH KMEEVSGTKG 450 

RQERNDIALK THQDXASCEN ELLKFSYIRT SFDBCELLRWE PYWPPDFKDL 500 

LGFHLFYKEA PYQHVTEFDG QDACGSNSWT WDIDPPLRS NDPKSQNHPG 550 

WLMRGIiKPWT QYAZrVKTLV TFSDERRTYG AKSOIIWQX DATNPSVPLD 600 

PISVSNSSSQ IILKWKPPSD PNGNITHYLV FWERQAEDSE LFELDYCTKG 650 

LKLPSRTWSP PFBSEDSQKH NQSEYEDSAG ECCSCPKTDS QILKELEESS 700 

PRSTFSDYIaH HfWFVPRPS 719 
Cutting site 

R KRR 723 

Beta subunit 

SLGDVGN VTVAVPTVAA FPOTSST5VP 750 

TSPEEHRPFE KVVNKE SLVT SGLRHFTGYR IELQACNQDT PEERCSVAAY 800 

VSARTMPEAX ADDIVGPVTH EIFENNWHL MWQEPKEPNG LXVZiYEVSYR 850 

RYGDEELHLC V5BKHFALER GCRLRGLSPG NYSVRIRATS LAGNGSWTEP 900 

TYPYVTDYID VPSNXAKIII GPLIFVFLFS WIGSIYLFL RKRQFDGPLG 950 

PLYASSNPEY LSASDVFPCS VYVPDEWEV5 REKITI/LREI* GQGSFGMVYE 1000 

GHARDIIKGE AETRVAVKTV WBSASLRERI EFLNEASVMK GFTCHHWRL 1050 

LGWSKGQPT LWMELMAHG DLKSYLRSLR PEARNOTGRP PPTLQEKIQM 1100 

AAEIADGMAY I*NAKKFVHRD LAARNCMVAH DFTVK I GD FG MTRDIYETDY 1150 

YRKGGKGLLP VRWMAPBSI*K DGVFTTSSDK WSFGWI.WEI TSLAEQPYQG 1200 

LSNEQVLKFV MDGGYLDQPT3 NCPERVTDLM RMCWQFHPKM RPTFLEIVML 1250 

LKDDLHPSFP' EVSFFHSEEN KAPESEELEM EFEDMEKVPL DRSSHCQREE 1300 

AGGRDGGSSL GFKRSYEEHI P YTHMNGGKK NGRILTLPRS NPS ~ 1343 



Fig. 12. 
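
By way of illustration only, the layout of the figure (alpha subunit ending at residue 719, a four-residue cutting site RKRR ending at residue 723, and the beta subunit following) might be reproduced computationally as sketched below. The function name split_proreceptor is hypothetical, and the sketch assumes that the first occurrence of the motif in the supplied sequence is the cutting site, which would need to be confirmed for any full-length sequence. The short sequence used here is a toy construct for demonstration and is not the receptor sequence.

```python
def split_proreceptor(mature_seq: str, cut_motif: str = "RKRR") -> dict:
    """Split a proreceptor sequence (leader removed) at the first
    occurrence of cut_motif and report 1-based residue positions."""
    start = mature_seq.find(cut_motif)       # 0-based index of the motif
    if start == -1:
        raise ValueError(f"cut motif {cut_motif!r} not found")
    end = start + len(cut_motif)             # 0-based index just past the motif
    return {
        "alpha": mature_seq[:start],
        "alpha_range": (1, start),           # 1-based, inclusive
        "cut_range": (start + 1, end),       # 1-based, inclusive
        "beta": mature_seq[end:],
        "beta_start": end + 1,               # 1-based
    }


# Toy construct: a short stand-in "alpha" segment, the motif, and a
# short stand-in "beta" segment (hypothetical, for demonstration only).
toy = "HLYPGEVCPG" + "RKRR" + "SLGDVGN"
parts = split_proreceptor(toy)
print(parts["alpha_range"], parts["cut_range"], parts["beta_start"])
```

Applied to a full-length sequence numbered as in the figure, the same function would be expected to report a cutting site spanning residues 720 to 723, provided the motif does not occur earlier in the sequence.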







